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1 . 0  INTRODUCTION 


This  report  is  a  summary  of  the  work  carried  out  by  the 
Scientific  Studies  Corporation  (SSC)  team  in  Phase  II  of  the  "Two- 
Dimensional  Processing  for  Radar  Systems"  Small  Business 
Innovation  Research  (SBIR)  program  for  the  U.  S.  Air  Force 
Research  Laboratory,  Sensors  Directorate,  Rome  Research  Site 
(AFRL/SNRT).  As  such,  the  report  includes  also  the  work  carried 
out  by  the  University  of  Central  Florida  (UCF)  under  contract  to 
SSC.  Dr.  Qingwen  Zhang,  post-doctoral  researcher  at  UCF,  and 
Prof.  W.  B.  Mikhael,  Chairman  of  the  Electrical  and  Computer 
Engineering  Department  at  UCF,  executed  the  subcontract  to  SSC, 
and  their  efforts  are  hereby  acknowledged. 

The  work  reported  herein  was  carried  out  in  the  context  of 
space-time  adaptive  processing  (STAP)  for  airborne  surveillance 
radar  systems  in  support  of  an  in-house  research  effort  at 
AFRL/SNRT.  However,  this  work  has  application  in  other  areas, 
such  as  communication  systems,  active  sonar  array  systems,  optical 
sensor  systems,  non-destructive  inspection  (NDI)  systems, 
geophysical  array  systems,  mine  detection  systems,  and  medical 
technology. 

This  report  covers  two  related,  but  distinct,  technical 
thrusts  of  the  program.  The  first  thrust  involves  a  generic  STAP 
architecture  that  covers  a  wide  variety  of  algorithms  and  their 
associated  detection  rules.  In  particular,  this  architecture 
covers  the  classical  matched  filter  (MF) ,  as  well  as  three  new 
algorithms  for  the  calculation  of  STAP  weights.  These  novel 
algorithms  are  low-dimensionality  options  to  the  MF,  and 
constitute  a  major  technical  contribution  of  the  program. 
Furthermore,  these  algorithms  admit  straightforward  adaptive 
configurations  for  the  unknown- covariance  case,  just  as  the 
adaptive  matched  filter  (AMF)  is  the  unknown- covariance 
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configuration  of  the  MF .  All  three  new  algorithms  can  be 
configured  in  a  variety  of  ways,  based  on  the  selection  of  key 
parameters,  and  the  generic  architecture  admits  all  such 
variations.  Thus,  the  generic  architecture  provides  new  insights 
into  the  structure  of  STAP  algorithms  and  detection  rules.  This 
architecture  constitutes  another  technical  contribution  of  the 
program . 

The  second  technical  thrust  involves  the  model-based 
multichannel  detection  formulation  in  the  context  of  a  two- 
dimensional  (2-D)  representation  for  space-time  processes  in 
general,  and  airborne  surveillance  phased  array  radar  systems  in 
particular.  For  the  phased  array  STAP  and  detection  problem,  such 
a  formulation  requires  a  2-D  parametric  model  for  the  channel 
output  process  under  the  target-absent  hypothesis.  The  identified 
2-D  parametric  model  is  used  to  design  a  whitening  filter  for  the 
interference  process,  as  required  for  the  parametric  adaptive 
matched  filter  (PAMF)  methodology  pioneered  by  Rangaswamy  and 
Michels  (1997)  .  Formulation  of  a  2-D  PAMF  methodology  and  its 
algorithmic  implementation  constitute  a  third  technical 
contribution  of  the  program.  The  2-D  PAMF  offers  major  advantages 
over  the  2-D  innovations-based  detection  architecture  (IBDA) 
methodology  formulated  and  demonstrated  in  Phase  I  of  this  two- 
phase  program  in  the  context  of  2-D  STAP  (Roman  and  Davis,  1997)  . 
The  2-D  IBDA  introduced  in  Phase  I  is  a  2-D  formulation  of  the 
IBDA  methodology  pioneered  by  Metford  and  Haykin  (1985)  for  the 
scalar  one -dimensional  (1-D)  case,  and  extended  by  Michels  (1991) 
to  the  multichannel  1-D  case.  Advantages  of  the  2-D  PAMF  over  the 
2-D  IBDA  include  a  simpler  structure,  since  a  whitening  filter  for 
the  target  process  is  unnecessary.  Another  advantage  is  the 
availability  of  multiple  detection  rules  that  can  be  utilized  in 
the  context  of  an  extended  2-D  PAMF.  Each  alternative  detection 
rule  offers  distinct  features,  such  as  constant  false  alarm  rate 
(CFAR)  detection. 
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1 . 1  Airborne  Surveillance  Phased  Array  Radar  Problem  Statement 


A  state-of-the-art  phased  array  radar  surveillance  platform 
in  a  modern  scenario  has  multiple  functions,  including  target 
detection,  target  tracking  and  track  association,  and  possibly 
target  identification.  The  focus  herein  is  on  moving  target 
detection,  which  is  a  precursor  to  the  other  functions.  A  typical 
airborne  surveillance  radar  scenario  involves  a  linear  array  radar 
consisting  of  J  equally-spaced,  identical  antenna  elements  (or 
identical  beamformed  subarrays)  in  a  side-looking  configuration  on 
an  airborne  platform  moving  at  a  constant  speed  in  level  flight. 
The  array  is  aligned  with  the  aircraft's  longitudinal  axis,  and 
the  aircraft  velocity  makes  a  crab  angle  y  with  the  aircraft's 
longitudinal  axis.  The  radar  array  is  radiating  a  coherent  pulse 
train  of  N-pulse  duration,  at  a  constant  radiation  frequency,  and 
at  a  constant  pulse  repetition  frequency  (PRF) .  Each  antenna 
element  (or  group  of  elements)  is  referred  to  as  a  channel,  and 
the  ith  channel  output  (after  pulse  compression,  demodulation,  and 
sampling)  corresponding  to  a  single  range  resolution  cell  (gate) 
is  a  complex-valued  discrete-time  sequence  denoted  as  {Xi(n)|n  =  0, 1... 
. ,  N-1}.  The  J  scalar  sequences  are  concatenated  to  form  a  vector 
sequence  {x(n)  I  n  =  0,  1,  .  .  .  ,  N-1}.  Process  {x(n)}  is  assumed  to  be 
stationary,  ergodic,  zero-mean,  and  Gauss ian-distributed. 

The  received  signal  in  an  airborne  surveillance  phased  array 
radar  can  be  referred  to  as  a  space- time  process,  wherein  the 
spatial  connotation  arises  from  the  spatial  diversity  of  the 
antenna  array  elements.  In  general,  this  received  signal  contains 
a  moving  target  component,  as  well  as  receiver  (broadband)  noise, 
jammer  noise  (broadband  interference),  and  ground  clutter 
(narrowband  interference)  components.  The  system's  objective  is 
to  detect  the  moving  target  given  the  noise-  and  interference- 
corrupted  received  signal.  This  problem  admits  a  dual -hypothesis 
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formulation,  with  Hq  representing  the  null  hypothesis  (target  not 
present),  and  Hi  representing  the  alternative  hypothesis  (target 
present) .  In  a  typical  scenario,  the  moving  target  detection 
processor  makes  a  detection  decision  each  coherent  processing 
interval  (CPI)  for  each  range  gate,  which  requires  processing  a 
finite-duration  sequence  {x(n)  I  n  =  0,  1, ,  N-1}  for  each  range  gate. 

To  date,  most  of  the  development  and  analysis  of  optimum 
joint-domain  adaptive  algorithms  and  sub-optimum  block- covariance 
algorithms  for  STAP  involves  the  1-D  multichannel  (vector) 
representation  of  the  radar  space-time  signal.  Such  past  work 
encompasses  target  detection  and  interference  rejection  in 
airborne  surveillance  radar  arrays,  as  represented  in  the  work  of 
Brennan  and  Reed  (1973),  Jaffer  et  al .  (1991),  and  Ward  (1994). 
The  algorithms  discussed  by  these  authors  are  referred  to  often  as 
the  conventional  (or  classical)  STAP  algorithms.  More  recently, 
Michels  (1991)  and  Roman  and  Davis  (1993a;  1993b)  have  adopted  the 
1-D  vector  representation  using  multichannel  auto-regressive  (AR) 
and  state-space  models,  respectively,  for  joint-domain 
innovations -based  detection. 

1 • 2  Maximinn  Correlation  And  Other  STAP  And  Detection  Methods  In 

The  Context  Of  A  Generic  Detection  Architecture 


A  summary  of  each  novel  STAP  algorithm  is  provided  next.  In 
the  first  new  STAP  algorithm  the  principle  of  orthogonal 
projections  (OP)  is  applied  to  generate  a  sequence  of  residuals. 
This  OP  algorithm  is  distinct  from  the  least-squares  predictive- 
transform  (LSPT)  algorithm  of  Guerci  and  Feria  (1996)  in  three 
important  aspects.  First,  the  LSPT  includes  a  transform  step, 
whereas  the  OP  does  not.  Second,  the  LSPT  is  formulated  as  the 
optimal  solution  to  the  improvement  factor  criterion,  which  leads 
to  a  detection  rule  involving  the  minimum  eigenvalue  of  the 
residual  (prediction  innovation)  covariance  matrix.  This  minimum 
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eigenvalue  criterion  is  associated  naturally  with  the  transform 
step.  In  contrast,  the  OP  utilizes  the  MF  detection  rule,  which 
involves  inner  products  of  the  residuals.  Third,  the  LSPT  is  a 
block  processing  method  in  the  predictive  step  as  well  as  in  the 
transform  step.  In  the  predictive  step  of  the  LSPT  (the  step  in 
common  with  the  OP  method),  the  full -dimension  (spatial  and 
temporal)  array  output  vector  is  segmented  into  two  sub-vectors, 
and  one  sub-vector  is  utilized  to  predict  the  other.  The 
dimensions  of  the  two  sub-vectors  add  up  to  the  full  dimension  of 
the  array  output  vector  (from  the  dimensionality-reduction  stand¬ 
point,  a  wise  choice  is  to  select  each  of  the  two  sub-vectors  to 
be  of  dimension  equal  to  one-half  the  full -dimension ,  as 
considered  in  Section  4.2  for  one  option  of  the  OP  algorithm) .  In 
contrast,  in  its  most  general  form  the  OP  is  a  hybrid  processing 
method,  with  block-processing  as  well  as  sequential-processing 
characteristics.  This  is  a  consequence  of  the  fact  that  the 
dimensions  of  each  of  the  two  sub-vectors  can  be  selected  such 
that  their  sum  is  less  than  the  full-dimension.  Via  this 
mechanism  the  OP  method  introduced  herein  attains  a  higher  level 
of  dimensionality  reduction  than  the  LSPT. 

One  particular  configuration  of  the  OP  algorithm  leads  to  an 
implementation  identical  with  the  parametric  adaptive  matched 
filter  (PAMF)  using  the  Yule-Walker  (YW)  and  Levinson-Durbin  (LD) 
AR  model  identification  algorithms  (Section  4.1) . 

The  second  novel  algorithm  is  based  on  the  concept  of 
canonical  variables  and  canonical  correlations.  This  algorithm  is 
referred  to  as  maximum  correlation  (MC)  since  it  is  based  on 
maximization  of  the  correlation  between  the  "past"  and  the 
"future"  of  the  array  output  random  process.  The  MC  method  is  an 
extension  of  earlier  work  by  SSC  in  the  specific  context  of 
sidelobe  canceling  for  adaptive  arrays  (Roman,  1998a)  . 
Specifically,  Roman  (1998a)  demonstrated  that  the  conventional 
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sidelobe  canceler  problem  can  be  formulated  in  the  context  of 
canonical  correlations,  and  the  solution  attained  is  equivalent  to 
the  well-known  minimum  mean- square -error  solution.  It  is  possible 
to  show  that  the  MC  approach  is  equivalent  also  to  the  reduced- 
dimension  generalized  sidelobe  canceler  (GSC)  formulated  by- 
Goldstein  and  Reed  (1997) .  Since  the  multistage  Wiener  filter 
(MWF)  (Goldstein  et  al . ,  1998)  is  a  sequence  of  orthogonal 
decompositions  of  the  same  form  as  the  GSC,  it  is  reasonable  to 
expect  that  an  equivalence  exists  between  the  MC  method  presented 
herein  (which  is  a  generalization  of  the  MC  method  for  the 
sidelobe  canceler)  and  the  MWF.  Further  investigation  of  this 
issue  remains  for  future  work. 

The  MC  algorithm  is  a  hybrid  processing  method,  with  block¬ 
processing  and  sequential-processing  characteristics.  The  method 
also  allows  for  configurations  with  significant  reduction  in 
dimensionality,  in  relation  to  the  MF  and  the  LSPT.  This  is  due 
to  the  fact  that  the  MC  algorithm  has  two  mechanisms  for  attaining 
reductions  in  dimensionality.  In  addition,  the  MC  formulation 
results  in  simple  expressions  for  entropy  and  mutual  information 
of  the  array  output  process  (Section  5.1).  These  concepts 
facilitate  the  development  of  probabilistic  criteria  for 
dimensionality  reduction  (Section  5.2). 

The  third  new  algorithm  consists  of  the  application  of  the 
orthogonal  projection  principle  to  the  array  output  data  after 
transformation  onto  the  canonical  variables  basis,  and  therefore 
is  referred  to  as  orthogonal  projection  using  canonical  variables 
(OPCV) .  This  algorithm  combines  the  best  features  of  the  OP  and 
MC  methods,  and  is  more  powerful  than  either.  In  particular,  the 
OPCV  includes  an  optimal  mechanism  for  dimensionality  reduction  of 
the  data  residual,  whereas  the  OP  lacks  such  a  mechanism. 
Furthermore,  the  OPCV  data  residual  is  less  than  or  equal  to  the 
MC  data  residual  for  all  possible  configurations.  Beyond  these 
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features,  the  OPCV  provides  unique  insight  into  the  structure  of 
orthogonal  projection  in  the  context  of  STAP. 

All  three  algorithms  (OP;  MC;  OPCV)  admit  configurations  that 
are  fully  optimal  with  respect  to  the  associated  criterion,  but 
their  most  important  practical  aspect  is  to  provide  alternatives 
to  the  optimum  MF  in  their  reduced-dimensionality  configurations. 

The  algorithmic  discussion  presented  herein  is  from  the 
analytical  point  of  view,  as  opposed  to  the  computational  point  of 
view.  That  is,  a  formulation  is  presented  for  each  algorithm, 
without  addressing  the  issue  of  numerical  implementation.  Also, 
all  algorithmic  formulations  are  presented  for  the  known- 
covariance  case  in  order  to  emphasize  the  theoretical  aspects,  as 
well  as  to  simplify  the  notation.  However,  each  algorithm  can  be 
implemented  adaptively,  and  adaptive  issues  are  considered  briefly 
for  each  algorithm.  An  important  issue  in  the  context  of  adaptive 
systems  is  that  dimensionality  reduction  leads  to  a  reduction  in 
secondary  data  requirements,  and/or  an  increase  in  detection 
performance,  in  relation  to  the  AMF. 

1-3  STAP  and  Detection  Via  the  Two-Dimensional  Representation 

As  stated  previously,  prior  work  in  model -based  STAP  has  been 
focused  on  the  1-D  vector  representation  of  the  channel  output 
process.  Alternatively,  a  space-time  signal  can  be  represented  as 
a  2-D  scalar  process.  The  2-D  representation  is  essential  for 
visualizing  and  understanding  the  spectral  energy  and  correlation 
characteristics  of  the  total  signal,  and  of  the  ground  clutter 
component  in  particular.  This  2-D  representation  also  offers 
algorithmic  and  modeling  advantages.  With  respect  to  model -based 
detection  for  the  airborne  surveillance  phased  array  radar  STAP 
problem,  1-D  models  offer  dynamic  and  static  degrees  of  freedom  in 
the  temporal  axis,  but  only  static  degrees  of  freedom  in  the 
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spatial  axis.  In  contrast,  2-D  models  offer  dynamic  and  static 
degrees  of  freedom  in  both  axes  (space  and  time) .  Thus,  a  model - 
based  detection  methodology  using  2-D  scalar  models  is  inherently 
better-suited  to  the  cases  wherein  channel-to-channel  correlation 
exhibits  a  complicated  structure.  Such  cases  occur  physically 
when  the  platform  and  scenario  parameters  lead  to  a  non- integer¬ 
valued  clutter  ridge  slope  parameter,  and/or  to  non-zero 
misalignment  between  the  array  longitudinal  axis  and  the  platform 
velocity  vector,  and/or  to  non-zero  internal  clutter  motion. 
These  conditions  are  common  in  airborne  surveillance  radar 
scenarios . 

The  2-D,  least-square,  frequency  domain  (2D-LS-FD)  technique 
formulated  by  Mikhael  and  Yu  (1994)  was  adopted  in  Phase  I  as  the 
baseline  2-D  model  identification  method.  This  technique 
approximates  a  given  2-D  complex-valued  field  with  a  2-D  rational 
function  model  of  the  auto-regressive  moving- average  (ARMA)  class. 
Of  course,  the  ARMA  class  includes  the  AR  and  the  moving- average 
(MA)  classes.  In  the  2D-LS-FD  technique  the  ARMA  coefficients  are 
obtained  as  the  solution  to  a  set  of  linear  equations.  Excellent 
results  have  been  obtained  in  the  image  noise  canceling  problem 
(Mikhael  and  Yu,  1994),  which  is  related  to  clutter  cancellation 
in  radar  space- time  processing.  The  2D-LS-FD  technique  was 
applied  successfully  in  Phase  I  to  clutter  modeling.  However, 
detection  results  obtained  in  Phase  II  using  the  2D-LS-FD 
technique  in  the  context  of  a  2-D  PAMF  were  very  poor  in  relation 
to  those  obtained  with  other  methods,  and  were  significantly  below 
those  obtained  using  the  AMF  implemented  via  sample  matrix 
inversion.  This  poor  performance  is  a  result  of  the  high  noise 
present  in  the  frequency-domain  representation  (magnitude  as  well 
as  phase)  of  the  channel  output  process,  as  required  by  the 
algorithm  (in  the  2D-LS-FD,  model  identification  is  carried  out  in 
the  frequency  domain) .  As  a  consequence,  other  2-D  model 
identification  techniques  were  considered.  UCF  and  SSC  settled  on 
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the  2-D  AR  least-squares  (2DARLS)  algorithm  as  the  preferred 
technique  to  apply  in  the  context  of  the  2-D  PAMF.  The  2DARLS  is 
conceptually  simple  and  provides  robust  model  identification 
performance.  For  example,  the  closely-related  1-D  multichannel  AR 
LS  algorithm  has  out-performed  a  suite  of  algorithms  in  recent 
studies  (Roman  et  al .  ,  2000).  In  addition,  the  2DARLS  admits 
various  numerically  stable  and  efficient  software  implementations 
(Appendix  A) .  The  2-D  PAMF  formulation  and  its  architecture  using 
the  2DARLS  identification  algorithm  is  another  major  contribution 
of  this  program. 

1.4  Report  Overview 

The  generic  PAMF  architecture  is  introduced  first  in  Section 
2.0,  along  with  notation  and  definitions  that  are  used  in  other 
sections.  Then  the  MF,  OP,  MC,  and  OPCV  STAP  algorithms  are 
summarized  in  Sections  3.0  through  6.0,  respectively.  Model-based 
multichannel  detection  in  the  context  of  the  2-D  representation  of 
the  channel  output  vector  process  is  discussed  in  Section  7.0, 
including  the  2-D  PAMF  formulation  and  the  2DARLS  identification 
algorithm.  A  summary  and  suggestions  for  further  work  are 
presented  in  Section  8.0.  Appendix  A  summarizes  the  LS 
identification  algorithm  for  multichannel  AR  processes,  and 
presents  computational  options  for  its  software  implementation  in 
the  context  of  the  PAMF  for  STAP  in  airborne  surveillance  phased 
array  radar  systems.  These  procedures  and  software  options  apply 
to  the  2DARLS  with  straightforward  modifications. 
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2.0  GENERIC  BLOCK  STAR  AND  DETECTION  ARCHITECTURE 


A  Standard  airborne  surveillance  radar  system  scenario  and 
side-looking  linear  array  configuration  with  typical  STAP 
parameters  is  assumed  herein.  As  stated  earlier,  the  number  of 
channels  is  J,  the  number  of  pulses  in  a  CPI  is  N,  and  {x(n)|n  =  0,1,. 
.  .  ,  N-1}  is  the  received  vector  sequence  for  the  range  bin  to  be 
processed  to  attain  a  detection  decision.  This  data  set  is 
referred  to  as  the  primary  data.  Also  of  relevance  is  the 
secondary  data  set,  which  is  a  collection  of  channel  output 
sequences  for  K  range  bins  in  the  neighborhood  of  the  primary 
data.  In  the  context  of  secondary  data  sets,  a  "neighborhood"  is 
defined  in  many  ways,  and  any  standard  approach  suffices  for  our 
purposes  provided  that  the  secondary  data  set  is  selected  to  be 
representative  of  the  null  hypothesis  condition  (target  not 
present)  and  statistically- independent  of  the  primary  data.  This 
secondary  data  set  is  denoted  herein  as  {>^(n|Ho)  I  n  =  0,  1, . . . ,  N-1;  k  =  1, 
For  notational  simplicity,  it  is  assumed  herein  that  N 
is  an  even  number.  This  is  hardly  a  restriction  in  the  practical 
sense  since  in  most  radar  systems  N  is  selected  to  be  a  power  of 
two. 


The  generic  STAP  architecture  proposed  herein  is  presented  in 
Figure  2-1,  and  the  delay  stack  and  advance  stack  blocks  that 
appear  in  this  figure  are  defined  in  Figures  2-2  and  2-3, 
respectively.  The  steering  vector  sequence  {e(n)  |  n  =  0,  1,  .  .  .  ,  N-1}  is 
formed  by  un-stacking  the  so-called  JN-element  steering  vector  e 
into  N  J-element  vectors  (this  un-stacking  is  the  inverse  of  an 
Nth  order  advance  stack  operator) .  That  is. 


(2-la) 


e(0) 

e(1) 

e(N-1) 
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(2-lb) 


n  =  0, 1 , . . . ,  N-1 


where  f(j  and  fg  denote  normalized  Doppler  and  spatial  frequencies, 
respectively. 

With  respect  to  Figure  2-1,  stacked  (or  block)  vectors  X2>p(n), 
e^p(n),  and  :Q(n),  and  By-  :q(")  are  defined  as  indicated  in  Figures 
2-2  and  2-3,  respectively.  That  is,  X2>p(h)  and  62>p(n)  are  JP- 
element  vectors,  and  :Q(n)  and  ©j.Q  (n)  are  JQ-  element  vectors . 
Subscript  T  denotes  that  vector  Xjp.p(n)  represents  the  past  of  the 

process  {x(n)},  with  respect  to  time  instant  n,  and  subscript  ^ 
denotes  that  vector  X^  :q(")  represents  the  future  of  the  process 

{x(n)},  with  respect  to  time  instant  n.  Data  item  x(n)  can  be 
included  either  in  the  past  or  in  the  future  (as  selected  herein) , 
without  loss  of  generality.  Of  course,  these  considerations  apply 
equally  for  the  steering  sequence,  {e(n)} .  For  simplicity,  the 
discrete- time  argument,  D,  is  dropped  from  the  past  and  future 
block  vectors  in  instances  where  the  intended  meaning  is  clear. 

Integers  P  and  Q  have  specific  meaning  in  each  weight 
calculation  algorithm,  as  discussed  in  the  sections  that  follow. 
In  order  to  simplify  notation,  and  without  loss  of  generality,  it 
is  assumed  herein  that 

(2-2)  P>Q 

Also,  the  following  constraint  must  be  satisfied  by  the  integers  P 
and  Q: 

(2-3)  P  +  Q<N 
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This  constraint  states  that  the  number  of  blocks  in  the  data  block 
vectors  is,  at  most,  equal  to  the  number  of  available  data 
sequence  elements.  Equality  is  required,  for  example,  in  the  LSPT 
algorithm  (Guerci  and  Feria,  1996)  ;  however,  in  the  OP  and  MC 
algorithms  high  levels  of  dimensionality  reduction  are  attained 
with  P  +  Q  «  N  . 


The  data-based  block  vectors  have  block  covariances  that  are 
defined  in  terms  of  the  elements  of  the  matrix  auto-covariance 
sequence  (ACS)  .  Let  Rxx(m)  denote  the  mth  lag  of  the  data  ACS 
under  the  null  hypothesis,  defined  as 

(2-4)  R„(nn)  =  E[x(n|H„)x”(n-m|H„)] 


Unless  stated  otherwise,  hereinafter  all  covariance  matrices  are 
assumed  to  be  defined  for  the  null  hypothesis  condition,  and  all 
explicit  notational  indications  of  the  null  hypothesis  are 
omitted.  Then,  the  block  covariance  matrices  are  defined  as 


(2-5) 


^2>:P,P“  ^[-rP:P  -^:P 


R»(0)  Rxx(i) 

R«(1)  Rxx(O) 

R«(P-1)  R!i(P-2) 


Rxx(P-l) 

Rxx(P-2) 

Rxx(O)  , 


(2-6) 


'  Rxx(O) 

R«(1)  • 

1 

o 

DC 

b 

b 

1! 

m 

'ix  ' 

b 

IX 

b 

II 

Rxx(1) 

Rxx(O)  ■ 

CM 

1 

o  ••• 

DC 

Rxx(Q-i) 

Rxx(Q-2)  ■■ 

■  Rxx(O) 
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■  Rxx(1) 

Rxx(2)  •• 

Rxx(P) 

(2-7) 

^7:Q,(P:P 

Rxx(2) 

Rxx(3)  • 

Rxx(P  +  1) 

Rxx(Q) 

Rxx(Q  +  1)  • 

.  R„(P  +  Q-1) 

In  these  definitions,  the  Zapf  Chancery  font  size  14  is  used  for 
the  data  block  covariances,  and  the  subscripts  denote  whether  the 
matrix  is  a  past  block  covariance  (T)  ,  or  a  future  block 
covariance  (jF),  or  a  future- to-past  block  cross-covariance 
The  subscripts  (P  and  Q)  also  denote  the  number  of  block  rows 

(subscript  preceding  the  comma)  and  the  number  of  block  columns 
(subscript  following  the  comma).  Block  covariances  p  and 

^Kj.qq  are  square,  Hermitian,  as  well  as  block  Hermitian  and  block 

Toeplitz  (a  block  Toeplitz  matrix  is  a  matrix  in  which  the  (i,j)th 
block  element  is  a  function  of  i-j)  .  Block  covariance  ^.p  is 

rectangular  (unless  P  =  Q)  as  well  as  block  Hankel  (a  block  Hankel 

matrix  is  a  matrix  in  which  the  (i,j)th  block  element  is  a  function 
of  i+j)  .  y  p  i®  also  block  symmetric  when  P  =  Q. 

In  the  most  general  version,  weight  matrices  V  and  W  in 
Figure  2-1  are  dimensioned  JQxL  and  JPxL,  respectively,  and  both 
the  data  residual  vector  e(n)  and  the  steering  residual  vector  JJ(n) 
are  L-element  column  vectors,  with  L<JQ.  Weight  matrices  V  and 
W  are  generated  by  each  of  the  STAP  algorithms  considered  herein 
as  the  solution  to  distinct  optimality  criteria.  From  Figure  2-1, 
the  data  and  steering  residual  vector  sequences  are  given  as 


(2-8) 

e(n)  =  W^x^.Q(n)  -  V%p(n)  =  a(n)  -  £(n) 

n  =  P,  P+1,.. 

.  ,N-Q 

(2-9) 

u(n)  =  W^e^.Q(n)  -  V^e2,p(n) 

n  =  P,  P+1, . . 

.  ,N-Q 
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respectively.  The  first  available  residual  is  at  time  instant  P 

because  P>Q,  and  the  last  available  residual  is  at  time  instant 
N-Q  because  the  future  block  vector,  x^.q  (n),  has  Q  block  elements. 

Vector  sequences  {ot(n)}  and  {^(n)}  are  intermediate  variables  that 

have  meaning  only  for  some  algorithms. 

Weight  matrix  C  in  Figure  2-1  is  dimensioned  LxL,  and  it  is 
required  in  some  algorithms  in  order  to  normalize  and  de-correlate 
the  residual  vector  e(n)  so  that  its  covariance  matrix  is  equal  to 
an  identity  matrix.  Then,  the  normalized  (unit  variance)  and 
uncorrelated  (both  spatially  and  temporally)  data  residual  vector 
sequence  and  the  fully-processed  steering  residual  vector  sequence 
are  given  as 

(2-10)  y(n)  =  c'^e(n)  n  =  P,  P+1, ...,  N-Q 

(2-11)  s(n)  =  C%(n)  n  =  P,  P+1,...,  N-Q 

respectively.  Weight  matrix  C  is  determined  such  that  the 
following  condition  is  satisfied: 

(2-12)  C'^RJ0)C  =  C^QC  =  II 

where  Ree(O)  and  Q  denote  the  covariance  matrix  of  the  residual  e(n), 
(2-l3a)  Q  =  Ree(0)  =  E[e(n)eH(n)] 

(2-i3b)  a  = 

The  closed  form  expression  in  Relation  (2-13b)  is  very  useful. 
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Figure  2-1.  Generic  STAP  and  detection  architecture. 


X2,.p(n) 


Figure  2-2.  Pth  order  delay  stack  definition. 


Figure  2-3.  Qth  order  advance  stack  definition. 


L6 


A  weight  matrix  C  that  satisfies  Relation  (2-12)  can  be 
determined  from  the  factors  in  a  decomposition  of  the  covariance 
Q.  Notice  that  Relation  (2-12)  implies  Q  is  non-singular.  This 
is  a  reasonable  requirement  because  a  singular  Q  is  indicative  of 
an  ill-posed  problem.  Now  consider  any  factorization  of  Q  into 
the  product  of  a  matrix  B  and  its  Hermitian  transpose.  That  is, 

(2-14)  0  =  BB^ 

Then,  it  is  straightforward  to  show  that  Relation  (2-12)  is 
satisfied  with  a  weight  matrix  C  determined  as 

(2-15)  C^  =  B"^ 

The  singular  value  decomposition  (SVD) ,  the  LDL  decomposition,  the 
Cholesky  factorization,  or  any  other  such  factorization  can  be 
used  to  generate  the  matrix  factor  B.  Each  matrix  factor  type  has 
unique  analytical  and  numerical  properties. 

Each  of  the  two  sequences  {Y(n)}  and  {s(n)}  is  stacked  in  block 

vector  form  according  to  the  convention  in  Figure  2-3  to  generate 
GL  -element  block  column  vectors  y^  and  S^,  respectively.  Integer 

G  denotes  the  number  of  elements  in  the  residual  vector  sequence; 
that  is, 

(2-16)  G  =  N  -  P  -  Q  +  1 

where  the  nominal  N-element  input  data  sequence  is  assumed. 

Parameters  P  and  Q  are  key  to  the  definition  of  each 
algorithm,  and  can  be  viewed  as  analogous  to  model  order  in 
parametric  models  (Michels  et  al .  ,  1998;  Roman  et  al .  ,  1997, 

1998) .  That  is,  the  specific  values  selected  for  these  parameters 
establish  the  particular  structure  of  each  algorithm 


17 


configuration,  and  thus  impact  performance  directly.  In  the  OP 
and  MC  STAP  algorithms  considered  herein,  parameter  L  (which 
denotes  the  number  of  elements  in  the  residual  vectors)  also  plays 
a  similar  role,  whereas  for  the  MF  algorithm  it  is  specified  as 
JQ.  For  the  OP  and  MC  algorithms,  the  selection  of  parameter  L 
directly  impacts  detection  performance  as  well  as  computational 
requirements  (via  reductions  in  dimensionality) . 

The  detection  test  statistic  block  in  Figure  2-1  is  defined 
(for  the  Gaussian-distributed  data  case  considered  herein)  as 


(2-17) 


sH  V 
-G  -G 

S 

-G  -G 


N-Q 

]£sH(n)v(n) 

n=P 

N-Q 

^sH(n)s(n) 

n=P 


This  test  statistic  is  of  the  same  form  as  the  one  proposed  by 
Rangaswamy  and  Michels  (1997)  for  the  PAMF.  However,  the  PAMF 
test  statistic  differs  in  two  important  aspects:  first,  the  PAMF 
residuals  are  J-element  vectors,  and  second,  the  PAMF  residual 
sequence  is  of  duration  N  -  P|^|  (where  P|\/||  is  a  function  of  the  model 
identification  algorithm  used  in  conjunction  with  the  PAMF) .  Test 
statistic  A  is  compared  with  a  threshold  Tq  in  order  to  select 
between  the  null  and  alternative  hypotheses.  The  threshold  is 
calculated  such  that  a  pre-determined  false  alarm  rate  is  met, 
following  the  Neyman- Pearson  (NP)  detection  criterion  (the  NP 
criterion  is  to  maximize  the  probability  of  detection  at  a  fixed 
probability  of  false  alarm) . 

2.1  Constant  False  Alarm  Rate  (CFAR)  Issues 


A  pre-determined  threshold  can  be  used  for  a  detector  with 
the  constant  false  alarm  rate  (CFAR)  property,  wherein  detection 
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performance  is  independent  of  the  covariance  matrix  of  the  total 
disturbance  (clutter,  jamming,  and  receiver  noise) .  In  theory, 
only  a  few  select  detectors  have  the  CFAR  property.  One  such 
detector  is  the  MF,  with  a  detection  rule  similar  in  form  to  the 
middle  expression  in  Equation  (2-17)  .  This  similarity  suggests 
that  some  of  the  detectors  discussed  herein  that  use  Equation  (2- 
17)  may  have  the  CFAR  property,  or  exhibit  CFAR- like  performance 
over  a  range  of  conditions.  This  point  of  view  is  supported  by 
the  argument  developed  next . 

Consider  the  term  I S*^  y  I  (the  numerator  of  Equation  (2-17)) 

I  "G  Gl 

under  the  null  hypothesis,  and  assume  that  the  data  residual 
vector  sequence  {£(n)}  is  temporally  uncorrelated.  Then,  given 
Equations  (2-10)  through  (2-16),  the  expected  value  of  the 
numerator  of  the  test  statistic  A  is 


(2-18a) 
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where  C  denotes  an  appropriate  estimate  of  the  weight  matrix  C. 
The  non-diagonal  block  elements  in  the  right-hand-side  of  Equation 
(2-18b)  are  zero-valued  LxL  matrices  because  the  data  residual 
sequence  {e(n)}  is  assumed  to  be  uncorrelated  in  time.  If  C  is  a 
suf f iciently-accurate  estimate  of  C,  then  Equation  (2-12)  implies 
that  each  block  diagonal  element  of  the  matrix  on  the  right -hand- 
side  of  Equation  (2-18b)  approximates  an  LxL  identity  matrix. 
That  is, 
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(2-19) 


\^  =  C^QC 


where  l|_  denotes  an  estimate  of  the  LxL  identity  matrix.  Then  the 
block  covariance  of  the  data  residual  block  vector  is  an 

estimate  of  the  GLxGL  identity  matrix,  and  the  expectation  of  the 
numerator  of  the  detection  test  statistic  becomes 


(2-20) 


S*^  V 

2“ 

-G  -G 

where  Iq|_  is  an  estimate  of  the  GLxGL  identity  matrix.  The 

rightmost  term  in  Equation  (2-20)  is  the  same  as  the  denominator 
of  the  detection  test  statistic,  A.  Thus,  the  expected  value  of 
the  detection  test  statistic  is  unity,  irrespective  of  the 
structure  of  the  covariance  matrix  of  the  total  disturbance.  Such 
is  the  requisite  condition  for  GEAR  performance. 


In  adaptive  implementations  of  the  STAP  methods  (wherein  all 
covariance  matrices  are  substituted  by  their  estimates) ,  the  key 
assumptions  in  the  above -out lined  argument  are  that:  a)  the  data 
residual  vector  sequence  {e(n)}  is  uncorrelated,  and  b)  C  is  a 
suf f iciently-accurate  estimate  of  C .  In  addition,  GEAR  requires 
that  the  temporal  whitening  of  the  residual  and  the  estimate  of 
weight  matrix  C  be  attained  with  estimator  structures  that  are 
independent  of  the  disturbance  covariance  matrix.  The  OP  and  MG 
STAP  algorithms  discussed  herein  have  the  potential  for  generating 
an  uncorrelated  data  residual  vector  as  well  as  an  accurate 
estimate  of  the  weight  matrix  (see  Remark  2.2.4  in  Section  2.2) . 
However,  the  requirement  of  estimator  independence  from  the 
disturbance  covariance  matrix  is  difficult  to  meet.  In  addition, 
the  theoretical  assessment  of  that  issue  is  difficult  to  attain 
also.  Nevertheless,  the  OP  and  MG  STAP  methods  and  associated 
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detector  rules  have  the  potential  for  CFAR-like  performance  at 
least  over  a  range  of  disturbance  conditions . 

2 . 2  Remarks 


The  remarks  below  address  various  relevant  issues,  including 
the  preferred  approach  for  generating  the  data  residual  covariance 
matrix,  and  variations  of  the  generic  STAP  architecture. 

Remark  2.2,1.  A  variation  to  the  structure  defined  in  Equations 
(2-8)  and  (2-9)  consists  of  having  the  input  data  sequence  of 
duration  N+P+Q,  wherein  P  additional  sequence  elements  can  be 
viewed  as  being  "negative  time"  sequence  elements,  and  Q  sequence 
elements  are  added  as  "positive  time"  sequence  elements. 
Specifically,  the  data  sequence  is  of  the  form  {x(n)  | -P,  .  .  .  , -1 , 0,  1 ,  .  .  . 
,  N-1,  N,  N+1, .  .  .  N+Q-1},  and  similarly  for  the  steering  sequence.  In 
such  a  structure  the  residual  sequences  are  both  of  duration  N, 
with  initial  time  0.  The  key  distinction  in  this  variation  is 
that  it  utilizes  more  input  data.  That  difference  involves 
practical  implications  that  must  be  taken  into  account,  specially 
in  comparative  performance  evaluations.  The  case  where  the  input 
data  sequence  is  of  duration  N  is  the  baseline  in  this  work. 

Remark  2.2.2.  The  architecture  in  Figure  2-1  can  be  generalized 
further  by  allowing  the  stacking  of  the  primary  data  into  two 

column  vectors  of  any  possible  combination  of  number  of  elements, 
rather  than  as  discrete  multiples  of  J.  For  example,  X^.q  can  be 

selected  to  be  a  scalar,  irrespective  of  the  value  of  J,  and 
likewise,  can  be  selected  to  be  a  column  vector  such  that  its 

number  of  elements  is  in  the  range  {1,2,...,  J(N-1)}.  This 
generalization  manifests  itself  in  some  of  the  algorithms  as 
allowing  the  number  of  elements  in  the  residual  vectors  to  be 
selected  among  the  allowable  range  of  discrete  values,  rather  than 
as  a  multiple  of  J.  Such  generalization  may  offer  advantages  in 
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some  applications,  but  the  inherent  structure  of  the  space-time 
data  appears  to  favor  the  partition  modulo  J.  Nevertheless,  this 
option  has  attractive  features  and  should  be  investigated  further. 

The  selection  of  X^.q  as  a  scalar,  and  of  X^p  as  a  J(N-1)- 

element  column  vector  is  a  generalization  of  the  reduced-dimension 
GSC  of  Goldstein  and  Reed  (1997)  .  One  distinction  is  that  the  GSC 
structure  includes  a  blocking  matrix,  whereas  the  architecture  in 
Figure  2-1  does  not. 

Remark  2.2.3.  Another  generalization  to  the  architecture 
presented  herein  is  to  utilize,  in  an  ad  hoc  manner,  other 
expressions  for  the  detection  rule.  Two  such  candidates  are  the 
adaptive  coherence  (AC)  proposed  independently  by  Scharf  and 
McWhorter  (1996)  and  by  Conte  et  al.  (1995;  1996),  and  the 
generalized  likelihood  ratio  test  (GLRT)  proposed  by  Kelly  (1986) . 

Remark  2.2.4.  In  practical  situations  wherein  the  true  covariance 
n  is  unknown,  an  estimate  must  be  utilized  to  generate  weight 
matrix  C.  The  OP  and  MC  STAP  algorithms  include  formulas  to 
estimate  Q,  and  such  an  estimate  is  referred  to  herein  as  the 
model  residual  covariance.  Alternatively,  a  maximum  likelihood 
(ML)  estimate  can  be  generated  as  an  ensemble  average  over  the 
residual  sequences  of  the  secondary  data  set.  The  ML  estimate  is 
referred  to  herein  as  the  sample  residual  covariance.  Simulation- 
based  analyses  reported  elsewhere  (Roman,  1998b)  for  the  PAMF  in 
the  context  of  airborne  surveillance  phased  array  radar  systems, 
indicate  that  using  the  sample  residual  covariance  (instead  of  the 
model  residual  covariance)  to  design  the  weight  matrix  C  leads  to 
reduced  threshold  variability  and  higher  probability  of  detection. 
It  is  reasonable  to  expect  that  such  performance  extends  to  the  OP 
and  MC  methods;  however,  appropriate  simulation-based  analyses 
should  be  carried  out . 
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3.0  MATCHED  FILTER  (MF) 


The  MF  detection  algorithm,  proposed  independently  by  Cai  and 
Wang  (1990)  ,  Chen  and  Reed  (1991)  ,  and  Robey  et  al .  (19  92)  ,  is 

obtained  via  maximization  of  the  generalized  likelihood  assuming 
the  total  disturbance  covariance  is  known.  Its  adaptive 
variation,  the  AMF,  is  obtained  by  using  the  ML  estimate  in  place 
of  the  unknown  covariance.  Both  methods,  MF  and  AMF,  have  the 
CFAR  property  for  Gaussian-distributed  disturbance. 

The  MF  algorithm  fits  into  the  architecture  in  Figure  2-1  as 
follows.  For  the  MF,  the  key  integers  are  Q  and  P,  which  assume 
the  following  values: 

(3-1)  Q  =  N 

(3-2)  P  =  0 

In  this  algorithm  integer  L  is  determined  directly  by  J  and  Q  as 
(3-3)  L  =  JQ  =  JN 

Also,  given  P  and  Q,  it  follows  from  Equations  (2-16)  and  (3-1) - 
(3-2) ,  and  from  Figure  2-1,  that 

(3-4)  G  =  1 

(3-5)  V  =  [0] 

Weight  matrix  W  is  dimensioned  JNxJN,  and  is  obtained  as 
(3-6) 
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-1/2 

where  denotes  the  matrix  inverse  square-root  of  the  block 

covariance  of  the  total  disturbance.  For  a  positive-definite 
covariance,  the  matrix  square- root  is  unique,  and  can  be  generated 
via  the  SVD .  However,  most  practical  implementations  would  avoid 
calculation  of  the  matrix  inverse  square-root. 

Consider  now  the  data  and  steering  sequence  residuals.  For 
this  algorithm  each  of  these  two  is  a  single  JN-element  vector 
(not  a  sequence) ;  specifically, 

(3-v)  £  = 

(3-8)  u  = 


And  the  data  residual  covariance  under  the  null  hypothesis  is 
given  as 

(3-9)  n  =  ij„ 


since  W  is  the  whitening  filter  for  the  JN-element  data  vector 
X|^.  It  follows  from  Equation  (2-12)  that  weight  matrix  C  is  also 

the  JNxJN  identity. 


(3-10) 


'JN 


Finally,  the  test  statistic  for  this  algorithm  attains  the  form 


(3-11) 


From  Equations  (3-7)  through  (3-11) ,  it  is  clear  that  the  "matched 
filter"  nomenclature  is  due  to  the  inner  product  in  the  numerator 
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of  Equation  (3-11).  The  denominator  is  a  normalization  factor 
which  gives  the  CFAR  property  for  Gau  s  s  i  an  -  d  i  s  t  r  i  bu  t  e  d 
disturbance . 


In  the  AMF,  block  covariance  ^  is  replaced  by  its  ML 

estimate,  denoted  as  •  Block  covariance  is  referred 

to  as  the  sample  covariance  matrix,  and  in  most  typical  cases  it 
is  generated  as  an  ensemble  average  over  the  secondary  data 
{recall  that  the  secondary  data  is  assumed  to  be  representative  of 
the  null  hypothesis) .  The  data  residual  covariance  under  the  null 
hypothesis  is  then  of  the  form 


(3-10) 


^AMF  ^y;N,N  ^y;N,N 


^J:N,N 


Notice  that  approaches  the  identity  matrix  as  the  sample 

covariance  matrix  approaches  the  true  covariance. 
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4.0  ORTHOGONAL  PROJECTION  (OP) 

In  this  algorithm  the  whitening  of  the  data  residual  sequence 
is  sought  by  determining  the  orthogonal  projection  of  the  space 
spanned  by  denoted  as  ,  onto  the  space  spanned  by  X^^p/ 

denoted  as  X~  .  Since  the  data  is  zero-mean  and  Gaussian- 
distributed,  an  orthogonal  projection  in  these  vector  spaces  is 
equivalent  to  conditional  expectation.  Let  X^.Q(n|X~)  denote  the 
expectation  of  :Q(n)  conditioned  on  Xj,.p(n)  .  For  appropriately- 

selected  integers  P  and  Q,  this  conditional  expectation  is  of  the 
form 

(4- la)  x^.Q(n|X“)  =  E[x^.Qx5.p](E[xj,.pxJ.p]) 

(4 -lb)  x^.Q(n|X")  =  X2,.p(n) 

where  Definitions  (2-5)  and  (2-7)  have  been  invoked.  Given  the 
orthogonal  projection  (4-1) ,  an  uncorrelated  residual  sequence  can 
be  generated  based  on  the  architecture  in  Figure  2-1. 

For  this  algorithm  the  key  parameters  are  P  and  Q  also. 
These  parameters  can  assume  a  range  of  integer  values,  and  each 
pair  of  values  results  in  an  algorithm  with  unique  performance 
characteristics.  Thus,  it  is  more  appropriate  to  view  this 
algorithm  as  a  class,  rather  than  a  single  algorithm.  Class 
members  are  defined  by  the  specific  integer  values  adopted  for  the 
key  parameters  in  the  formulation,  and  those  values  must  satisfy 
the  following  conditions: 

(4-2)  1<Q<P<N-1 

(4-3)  P  +  Q<N 
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In  this  algorithm  parameter  L  is  determined  directly  by  J  and  Q  as 


(4-4)  L  =  JQ 

Since  integer  G  satisfies  Equation  (2-16)  for  any  valid  P  and  Q 
pair,  taking  into  consideration  Equations  (4-2)  and  (4-3)  together 
with  Equation  (2-16)  leads  to  the  following  permissible  range  of 
values  for  G  (as  a  function  of  P  and  Q)  : 


(4-5)  1  <  G  <  N  -  1 

Matrix  W  is  the  JQxJQ  identity, 

(4-6)  W=ljQ 

And  the  JPxJQ  matrix  V  and  the  JQxJQ  matrix  C  are  given  as 


(4-7) 

=  ^R 

V:Q.!P:P 

(4-8) 

T 

CD 

II 

X 

O 

respectively,  where  matrix  B  is  obtained  via  the  factorization  of 
the  data  residual  covariance  matrix,  Q,  as  in  Equation  (2-14) . 
Expression  (4-7)  for  weight  matrix  V  follows  from  Equation  (4-1) 
and  Figure  2-1. 

For  this  algorithm  the  JQ-element  data  and  steering  residuals 

are 

(4-9)  e(n)  =  x^.Q(n)-x^.Q(n|X”)  =  x^.Q(n)-  x^pCn)  n  =  P, . . . ,  N-Q 

(4-10)  u(n)  =  e^.Q(n)-!?(.^.Q^.p  !^'^ppey.p(n)  n  =  P,  ...,N-Q 
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respectively.  Also,  the  normalized  and  uncorrelated  (temporally 
as  well  as  spatially)  JQ-element  data  and  steering  residuals  are 


(4-11)  y(n)  =  c'^e(n) 

(4-12)  s(n)  =  c'^u(n) 

From  Equations  (2-13b) ,  (4-6)  ,  and  (4-7) 

(  4  -  13  )  Q=  ^yr.Q  Q  -  ^y;Q_-p;P  ^<P;P,P 


n  =  P,  P+1,...,N-Q 
n  =  P,  P+1,...,N-Q 


,  it  follows  that 

^y:Q,Q  ^J:Q,TP 


Matrix  Q  generated  according  to  Equation  (4-13)  is  the  model 
residual  covariance  referred  to  in  Remark  2.2.4. 

The  detection  test  statistic  for  this  algorithm  attains  the 
form  given  in  Equation  (2-17)  ,  wherein  the  block  form  of  the 
fully-processed  data  and  steering  vector  sequences,  {y(n)}  and  {s(n)}, 
respectively,  is  used  also. 

In  its  most  general  form  (when  Q<P«N)  the  OP  method  is  a 
hybrid  processing  method,  since  it  has  both  block  and  sequential 
processing  aspects.  That  is,  the  matrix  weight  V  is  generated  and 
applied  to  the  array  output  in  a  block  mode,  and  this  is  repeated 
several  times  to  generate  a  sequence  of  residual  vectors. 

One  advantage  of  this  configuration  in  relation  to  the  MF  is 
that  the  dimensionality  of  the  required  inverse  operation  is 
reduced  from  JN  for  the  MF  to  JP  for  the  OP.  This  presents  a 
major  reduction  in  the  computational  burden,  although  several 
matrix  operations  are  required  besides  the  JPxJP  matrix  inverse. 
Such  operations  include  matrix-matrix  products  and  matrix-vector 
products  involving  JQxJP  and  JPxJP  matrices  and  JQ-element 
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vectors,  one  matrix  addition  of  JPxJP  matrices,  and  two  additions 
of  JQ-element  vectors. 

In  the  context  of  the  radar  problem  considered  herein,  an 
adaptive  OP  (AOP)  algorithm  is  defined  by  using  ensemble  averages 
(ML  estimates)  over  the  secondary  data  in  place  of  unavailable 
block  covariance  matrices.  Analogously,  the  ML  estimate  of  the 
residual  covariance  matrix,  Q,  is  generated  as  an  ensemble  average 
over  the  residual  sequences  of  the  secondary  data  set.  The  ML 
estimate  is  preferred  over  the  model  estimate,  Equation  (4-13), 
since  it  yields  better  detection  performance  results  in  other 
contexts  (Roman,  1998b) . 

In  the  adaptive  context,  dimensionality  reduction  induces  a 
reduction  in  the  number  of  elements  required  for  the  secondary 
data  set.  This  is  of  relevance  in  practical  cases  where  the 
scenario  places  constraints  on  the  selection  of  the  secondary 
data,  which  is  often  the  case  for  typical  N  and  J  values.  In 
addition,  due  to  such  dimensionality  reduction  the  AOP  can  exhibit 
better  detection  performance  than  the  AMF  using  the  same  secondary 
data,  or  equivalent  performance  using  a  smaller  set  of  secondary 
data.  This  is  due  to  the  fact  that  the  accuracy  of  the  ML 
estimate  of  a  covariance  matrix  increases  as  the  number  of 
secondary  data  used  to  generate  the  estimate  increases. 

As  stated  earlier,  each  specific  choice  of  key  integers  leads 
to  a  distinct  algorithm.  Two  such  cases  of  interest  are  presented 
in  Sections  4.1  and  4.2  under  known- covariance  conditions  (for 
simplicity) .  The  selected  cases  represent  practical  boundaries 
for  algorithm  configuration  options:  fully-sequential  and  fully- 
block. 
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4 . 1  Parametric  Adaptive  Matched  Filter  With  Yule-Walker  (PAMF-YW) 

And  Levins on -Durbin  (PAMF-LD)  Algorithms 

The  PAMF  for  Gaussian-distributed  processes  has  been 
formulated  by  Michels  et  al .  (1998)  using  a  sequential  whitening 

filter  based  on  AR  models,  and  using  the  Strand-Nuttall  (SN)  model 
identification  algorithm  to  estimate  the  AR  parameters.  This 
detection  approach  is  referred  to  as  the  PAMF-SN.  A  related 
detection  approach  is  to  utilize  the  Yule-Walker  (YW)  algorithm  to 
estimate  the  AR  parameters  required  in  an  AR-based  PAMF.  Such  a 
detection  approach  is  referred  to  herein  as  the  PAMF-YW.  It  is 
demonstrated  herein  that  the  PAMF-YW  results  from  a  particular 
selection  of  the  key  parameters  P  and  Q  in  the  OP  approach.  In 
addition,  it  is  demonstrated  that  the  multichannel  Levinson-Durbin 
(LD)  algorithm  for  solving  the  augmented  YW  equations  leads  to  the 
PAMF-LD  detection  approach.  Thus,  both  the  PAMF-LD  and  the  PAMF- 
YW  are  implemented  with  the  detection  architecture  in  Figure  2-1. 

Equations  (4-2)  and  (4-3)  present  the  allowable  range  of 
values  for  the  P  and  Q  parameters  in  the  OP  algorithm.  For  the 
present  case,  select  P  and  Q  as 

(4-14)  Q=1 

(4-15)  1  <  P  <  N-1 

As  shown  below,  in  the  present  context  P  is  the  order  of  the  AR 
system  selected  to  represent  the  input  data.  Its  specific  value 
is  selected  based  on  performance  considerations.  Given  these 
values  for  P  and  Q,  it  follows  from  Equations  (4-4) -(4-6)  that 

(4-16)  L  =  J 

(4-17)  G  =  N-P 
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(4-18) 


Weight  matrices  V  and  C  and  covariance  matrix  Q  are  defined  next. 

The  equivalence  between  the  OP  method  and  the  PAMF-YW  rests 
on  an  alternative  interpretation  of  the  role  of  weight  matrix  V. 
Weight  matrix  V  is  JPxJ  in  this  case,  and  is  obtained  using 
Equation  (4-7)  for  Q  =  1 ;  namely, 

(4-19)  = 

An  algebraically-equivalent  expression  is  obtained  by  applying  the 
Hermitian  transpose  to  both  sides,  and  then  pre-multiplying  both 
sides  by  : 

(4-20)  ^2>:PP  V  = 

Now  let  the  unknown  matrix  V  be  a  partitioned  matrix  of  the  form 

'Ai 

Ap 

(4-21)  V  =  -  . 

-Ap 

where  each  matrix  in  the  set  {A(^  |  k  =  1 ,  2,  .  .  .  ,  P}  is  JxJ .  Given  this 
interpretation  of  weight  matrix  V,  it  is  straightforward  to 
recognize  that  Equation  (4-20)  is  the  well-known  Yule-Walker 
equation  of  time  series  analysis.  It  is  easy  to  recognize  also 
that  the  matrices  {AJ  are  the  matrix  parameters  of  a  multichannel 

AR  system  of  the  form 
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(4-22) 


n  =  0, 


where  {w(n)}  denotes  the  temporally-uncorrelated  input  (or  driving) 
sequence  with  covariance  matrix  Q.  System  (4-22)  is  referred  to 
as  an  AR  system  of  order  P,  and  is  denoted  as  AR(P) . 


Using  Equations  (4-9),  (4-10),  and  (4-21),  the  J-element  data 

and  steering  residual  sequences  are  determined  as 


|_| 

(4-23a)  e(n)  =  x(n)-x(n|n-1)  =  x(n)- V  Xjp.p(n) 

p  p 

(4 -23b)  £(n)  =  x(n)  +  ^A[^x(n-l<)  =  Aj^x(n-k) 

k=1  k=0 

(4-24a)  u(n)  =  e(n)-V^e^,p(n) 

p  p 

(4-24b)  u(n)  =  e(n)  +  ^  A^  e(n  -  k)  =  ^  A^  e(n  -  k) 

k=1  k=0 


n  =  P,  P+1,...,N-1 

n  =  P,  P+1, . . . ,  N-1 

n  =  P,  P+1,...,  N-1 

n  =  P,  P+1,...,  N-1 


respectively.  Notice  in  Equation  (4-23a)  that  the  more  common  and 
simpler  notation  x(n|n-1)  has  replaced  X^.Q(n|X  ).  Equations  (4-23) 

and  (4-24)  are  in  the  form  of  a  multichannel  moving  average  (MA) 
system,  with  inputs  {x(n)}  and  {e(n)},  respectively,  and  outputs  {e(n)} 
and  {u(n)},  respectively.  An  equivalent  interpretation  is  that 
these  expressions  constitute  the  whitening  filter  associated  with 
the  AR  model  in  Equation  (4-22) .  This  is  a  statement  of  the  fact 
that  an  MA  system  and  an  AR  system  are  system  inverses  of  each 
other.  Next,  from  Equations  (2-6),  (2-7),  (4-13),  (4-19),  and  (4- 
21) ,  the  data  residual  covariance  Q  is  determined  as 
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(4-25) 


P 

n  =  =  Rxx(0)+X  Ak  Rxx(-i<) 

k=1 

Finally,  the  uncorrelated  and  normalized  J-element  data  and 
steering  residual  sequences  are  obtained  as  in  Equations  (4-11) 
and  (4-12)  ,  since  the  JxJ  weight  matrix  C  is  obtained  as  in 
Equation  (4-8),  with  B  a  factorization  of  Q.  as  in  Equation  (2-14), 
and  O  calculated  as  in  Equation  (4-25) . 

Concatenation  of  Equations  (4-20)  and  (4-25)  leads  to  the 
augmented  Yule-Walker  equation.  Specifically, 


1 _ 

I  X 

1  V:1,!P;P 

■  Ij' 

Q 

CL 

_ 1 

I  ^!P;pp 

-V 

.  ^JP.J_ 

Equation  (4-26)  can  be  solved  using  the  multichannel  Levinson- 
Durbin  (LD)  algorithm.  Such  an  approach  applied  to  the  detection 
context  considered  herein  thus  leads  to  the  PAMF-LD  detection 
algorithm. 

In  the  context  of  this  sub- section,  the  OP  method  is  fully- 
sequential,  and  the  application  of  the  weight  matrix  V  in  Figure 
2-1  is  viewed  as  a  recursive  filter,  instead  of  a  block  filter. 

4 . 2  Matched  MaximiJin- Dimension  Orthogonal  Projection 

Consider  the  constraints  for  P  and  Q  in  Equations  (4-2)  and 
(4-3),  and  select  for  this  case 

(4-27)  Q  =  P  =  — 

2 
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These  values  result  in  fully-block  configuration,  which  is  the 
opposite  of  the  fully-sequential  configuration  of  Section  4.1 
above.  Then,  from  Equations  (2-16)  and  (4-4) -(4-6), 


(4-29)  G  =  1 

(4-30)  ^  “  *JN/2 

These  parameter  values  indicate  that  the  data  and  steering 
residuals  sequences  consist  of  only  one  L-element  vector. 


From  Equations  (4-7),  (4-8),  and  (4-13)  with  P  =  N/2,  the  JPxJP 

weight  matrix  V,  the  JPxJP  weight  matrix  C,  and  the  JPxJP 
covariance  matrix  Q  are  obtained  as  follows, 

(4-31)  V  =  ^2>.p p 


(4-32) 


(4-33) 

(4-34) 


^  ^y;P,lP:P 

BB^  =  Q 


X 


-1 

!P:P,P 


^y:P,2>:P 


An  augmented  set  of  equations  can  be  defined  by  concatenating 
Equations  (4-31)  and  (4-32),  as  in  Section  4.1.  But,  in  contrast 
with  the  case  handled  in  Section  4.1,  the  case  considered  in  this 
section  does  not  appear  to  have  alternative  interpretations. 

The  OP  algorithm  configuration  in  this  sub- section  is  a  block 
processing  method.  That  is,  weight  matrix  V  is  applied  once  per 
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CPI  per  range  cell  (the  residual  sequence  has  one  vector  element) . 
In  this  fully-block  configuration  the  past  and  future  vectors  have 
the  same  dimensionality,  so  the  conditional  expectation  in 
Equation  (4-1)  requires  the  inverse  and  product  of  square  matrices 
of  the  same  dimensions,  as  indicated  in  Equation  (4-31)  .  The 
residual  covariance  matrix  is  generated  also  involving  only 
operations  on  square  matrices  of  the  same  dimensions,  as  indicated 
in  Equation  (4-32)  .  This  results  in  the  minimum  number  of 
computations  for  a  block  processing  method  (Guerci  and  Feria, 
1996)  . 
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5.0  MAXIMUM  CORRELATION  (MC) 


The  MC  algorithm  for  adaptive  arrays  has  been  proposed  by 
Roman  (1998a)  in  the  context  of  adaptive  sidelobe  canceling  using 
complex-valued  data,  and  is  based  on  the  concept  of  canonical 
variables  and  canonical  correlations  formulated  by  Hotelling 
(1936)  for  real-valued  random  variables.  That  methodology  is 
extended  herein  to  STAR  applications  in  general,  and  enhanced  to 
include  a  detection  test.  One  of  the  contributions  in  (Roman, 
1998a)  is  the  non-trivial  extension  of  the  canonical  variables 
concept  to  handle  the  case  of  complex-valued  data. 

Hotelling's  canonical  correlations  formulation  is  as  follows. 
Given  two  sets  of  scalar  random  variables  (or  equivalently,  two 
random  vectors)  find  a  basis  for  each  set  such  that  in  the 
transformed  basis  the  first  variable  in  each  set  is  maximally 
correlated  with  the  first  variable  in  the  other  set,  and 
uncorrelated  with  all  the  other  variables  in  the  two  sets.  And 
also  that  the  second  variable  in  each  transformed  set  is  maximally 
correlated  with  the  second  variable  in  the  other  transformed  set, 
and  uncorrelated  with  all  the  other  variables  in  the  two 
transformed  sets.  And  so  on  until  all  the  variables  in  the  two 
sets  have  been  considered.  In  addition,  each  variable  in  the  two 
transformed  sets  is  normalized  to  unit  variance.  The  variables  in 
the  transformed  basis  are  the  canonical  variables,  and  the 
normalized  and  optimized  correlations  are  the  canonical 
correlations  .  The  maximization  required  repeatedly  in  the 
procedure  is  straightforward  for  real -valued  random  variables,  but 
the  problem  formulation  requires  a  modification  for  complex-valued 
random  variables  because  the  maximum  of  a  complex-valued  variable 
is  non-unique  (Roman,  1998a) . 

In  the  STAR  application  context  as  formulated  herein,  block 
vectors  and  X^.q  are  the  vectors  to  be  transformed  into 
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canonical  variables,  and  weight  matrices  W  and  V  in  Figure  2-1 
are  the  transformation  matrices.  The  foundation  for  the  MC 
algorithm  is  that  the  residual  sequence  {e(n)}  is  uncorrelated  in 
time  because  it  is  generated  as  the  difference  between  maximally- 
correlated  random  variables . 

As  before,  integers  P  and  Q  define  the  number  of  sub-vectors 
in  the  stack  vectors  X2>p(n)  and  X^.Q(n),  respectively;  that  is,  Xy.p{n) 

is  a  JP-  element  vector,  and  X^.q  (n)  is  a  JQ-  element  vector. 

Parameters  P  and  Q  can  be  selected  to  be  within  a  wide  range  of 
values,  but  are  considered  herein  to  satisfy  the  following 
conditions  (recall  Assumption  (2-2)  and  Constraint  (2-3)): 

(5-1)  1  <  Q  <  P  <  N-1 

(5-2)  P  +  Q<N 

Recall  that  Constraint  (5-2)  is  due  to  the  fact  that  the  number  of 
data  points  is  fixed.  Selection  of  P  and  Q  is  based  on  the 
typical  trade-off  between  computational  and  performance  issues 
(see  Section  5.2  for  additional  comments).  For  example,  small¬ 
valued  P  and  Q  allow  for  dramatic  dimensionality  reduction  in 
relation  to  the  MF,  but  can  entail  a  noticeable  loss  in  detection 
performance  in  relation  to  the  full -dimension  case  (P  +  Q  =  N). 

The  MC  algorithm  includes  another  variable  parameter,  the 
dimensionality  of  the  data  and  steering  residuals,  whose  selection 
allows  for  additional  dimensionality  reduction  beyond  that 
provided  by  appropriate  selection  of  P  and  Q.  In  the  generic 
architecture  of  Figure  2-1,  weight  matrices  V  and  W  are 
dimensioned  JPxL  and  JQxL,  respectively,  and  the  data  and  steering 
residuals,  e(n)  and  u(n),  respectively,  are  L-element  column  vectors, 
where  L  is  selected  such  that 
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(5-3) 


1  <  L  <  JQ 


Selection  of  L  is  based  on  dimensionality  and  performance 
considerations.  Criteria  for  the  selection  of  L  are  based  on 
information  theory  concepts  and  utilize  intermediate  results  from 
the  algorithm  (see  Section  5.2) . 

In  summary,  there  are  two  mechanisms  for  dimensionality 
reduction  in  the  MC  algorithm:  first,  selection  of  P  and  Q  such 
that  Q<P<N  (and  P«N  can  provide  satisfactory  performance);  and 
second,  selection  of  L  such  that  L<JQ. 


Formulation  of  the  method  and  establishment  of  the  general 
solution  is  accomplished  best  by  assuming  full  dimensionality  for 
the  residual  sequences;  that  is,  with  L=JQ.  Then,  the  two  JQ- 
element  vector  sequences  {a(n)  |n  =  P,  P+1,  .  .  .  ,  N-Q}  and  {2(n)  I  n  =  P,  P+1 , .  . 
.  ,  N-Q}  in  Figure  2-1  are  defined  in  terms  of  two  intermediate 
sequences,  {i)(n)  |  n  =  P,  P+1,  .  .  .  ,  N-Q)  and  {ji(n)  |  n  =  P,  P+1,  -  .  .  ,  N-Q},  as 
follows : 


■  n/n)  ■ 

x(n) 

a)(n)  = 

D2(n) 

= 

x(n  +  1) 

1 

_ 1 

x(n  +  Q-1) 

(5 -4b)  \)(n)  =  W^x^.Q(n) 

(5-5)  a(n)  =  a)(n)  =  W^x^.Q(n) 


n  =  P,  P+1,...,  N-Q 


n  =  P,  P+1,...,  N-Q 
n  =  P,  P+1, . . . ,  N-Q 
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:  ;  rx(n-i) 

_  yJo  x(n-2) 

:  :  x(n-P) 

M^jp(n)  .  bj(p-Q) 

’ev('^)1  r  ' 

(5-6b)  |Li(n)=  -----  =  ----  X  p(n)  n  =  P,  P+1,  . . . ,  N-Q 

L  ^  J 

( 5  -  7 )  m)  =  Uy(n)  =  V%p(n)  n  =  P,  P+1 , . . . ,  N-Q 

where  each  Wj(n),  for  i  =  1 ,  2,  .  .  .  ,  JQ,  is  a  JQ-element  column  vector, 
each  yi(n),  for  i  =  1,2,...,  JQ,  is  a  JP  -element  column  vector,  and  each 
hj(n),  for  i  =  1,  2,  .  .  .  ,  J(P-Q),  is  a  JP  -element  column  vector.  In  the 
context  of  this  general  case  with  full -dimensionality  (Q  < P <  N  and 
L  =  JQ),  W  is  JQxJQ,  V  is  JPxJQ,  and  H  is  JPxJ(P-Q);  also,  all 
three  matrices  have  full  rank.  The  sequence  {iip|(n)  |  n  =  P,  P+1,  .  .  .  ,  N-Q} 

of  J(P-Q)-element  vectors  lacks  a  role  in  the  STAP  architecture 
because  it  is  selected  by  the  MC  algorithm  such  that  it  spans  a 
subspace  orthogonal  to  \)(n) . 

The  residual  vector  e(n)  is  a  JQ-element  column  vector,  and 
the  data  residual  vector  sequence  is 

(5- 8a)  s(n)  =  a(n)  -  ^(n)  =  i)(n)  -  jj^(n)  n  =  P,  P+1, ...,  N-Q 

(5-8b)  e(n)  =  W%.Q(n)  -  V%p(n)  n  =  P,  P+1, ...,  N-Q 


n  =  P,  P+1,...,  N-Q 
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As  before,  G  is  used  to  denote  the  number  of  elements  in  the 
residual  vector  sequence.  Since  P  and  Q  are  specified  only  to  lie 
within  a  range,  then 

(5-9)  G  =  N  -  P  -  Q  +  1 

Definitions  analogous  to  those  in  Equations  (5-4)  through  (5-8) 
can  be  introduced  for  the  steering  sequence  path  in  the 
architecture  of  Figure  (2-1) .  However,  the  MC  algorithm  weight 
matrices  V  and  W  are  designed  based  on  probabilistic  concepts, 
and  applied  analogously  in  the  steering  sequence  path  of  Figure  2- 
1.  Also,  notice  that  in  the  MC  approach  both  weight  matrices  V 

and  W  play  a  role.  This  is  in  contrast  with  the  MF,  wherein  V= 
[0],  and  with  the  OP,  wherein  VV  =  Ijq. 

Equations  (5-8)  suggests  that  the  data  residual  sequence  is 

uncorrelated  in  time  if  the  weight  matrices  V  and  W  (as  well  as 
matrix  H)  are  selected  such  that:  a)  the  JQ  elements  of  are 

maximally  correlated  with  the  corresponding  elements  of  i)(n)  and 
uncorrelated  with  the  others,  b)  the  J(P-Q)  elements  of  are 

uncorrelated  with  the  elements  of  2)(n),  c)  the  elements  of  l)(n)  are 
pair-wise  uncorrelated,  and  d)  the  elements  of  iL(n)  are  pair-wise 
uncorrelated.  Such  is  the  objective  of  the  MC  approach,  as 
presented  in  detail  next. 

Maximum  Correlation  Formulation  JL  =  JQ_  Configuration) .  Generate 
the  weight  matrices  V  and  W,  and  matrix  H,  such  that  at  each 
instant  of  time  n,  the  conditions  stated  next  are  true  for  the 
sequences  {l)(n)}  and  {^(n)}.  First,  select  w.|  and  such  that  1)1(11)  is 

maximally  correlated  with  |li(n),  and  both  Di(n)  and  |Xi(n)  have  unit 
variance.  Second,  select  W2  ^2  that  D2(n)  is  maximally 
correlated  with  Mn),  and  D2(n)  and  |X2(*^)  both  uncorrelated  with 
Dl(n)  and  with  |J.i(n),  and  both  D2(n)  and  |i2(n)  have  unit  variance.  The 
ith  condition  (for  2<i<JQ)  is  to  select  Wj  and  Vj  such  that  Di(n)  is 


40 


maximally  correlated  with  |ij(n),  and  'Uj(n)  and  fii(n)  are  both 
uncorrelated  with  Vi(n),  '02(’^)'  •  •  •  /  with  |ii(n),  )i2(n)/  • 

•  •  /  l^i-i(n),  and  both  v,{n)  and  li.j(n)  have  unit  variance.  The  (JQ+1)th 

condition  is  to  select  h.,  such  that  |ljQ+i(n)  has  unit  variance  and  is 
uncorrelated  with  1)1(11),  1)2(11),  .  .  .  ,  DjQ(n),  and  with  |li(n),  |l2(r>)/  • 

•  •  '  M^Jq(i^)  •  Lastly,  the  JPth  condition  is  to  select  hj^p.Q^  such 

that  |J-jp(n)  has  unit  variance  and  is  uncorrelated  with  Di(n),  1)2(11),  . 

•  •  /  'i>jQ(n),  and  with  |ii(n),  H2(n).  •  •  •  ,  l^jp-i(n).  Carrying  out  the 

required  optimization  at  each  step  (maximizing  the  complex-valued 
correlation  coefficient  as  in  [Roman,  1998a] )  leads  to  the 

following  compact  set  of  matrix  equations. 


(5-10)  E[i)(n)i)H(n)]  =  W""  E[x^^Q(n)xH  ^(n)]w  =  w""  W  =  Ijq 


(5-11)  E[|i(n)jiH(n)]  = 


V" 


X2>:p(^)x5p(n)][V  H]  = 


V 


H 


^j,:Pp[V  H]  =  l 


JP 


(5-i2a)  E[^n)jiH(n)]  =  w'^E[x^^Q(n)xHp(n)][V  H]  =  w" [ V  H] 


(5-12b)  E[^n)^H(n)]  =  R^^(0)  =  R^^  = 


Pi  0 

0  P2 

0  0 


0  0 

0  0  •• 


(5-13)  E[3)(n)M.H(n)]  =  w";^^,j^p[V  :H]  = 


PjQ 

0 

oj 

>1 

0  • 

0 

!  0  -• 
1 

0" 

0 

P2  • 

••  0 

1  0  -• 
1 

0 

O 

1 

0  • 

Rjq 

0 : 

0 

(5-14)  1>p,>p2>...>pjo>0 
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where  p.  is  the  correlation  coefficient  between  'Uj(n)  and  |ii(n),  and 

Equation  (5-13)  follows  trivially  from  Equations  (5-12) .  The  unit 
variance  conditions  (Equations  (5-10)  and  (5-11))  are  introduced 
without  loss  of  generality  because  the  correlation  coefficient 
between  two  random  variables  is  invariant  to  a  scale  factor 
applied  to  either  one  of  the  two  variables  or  to  both  of  the 
variables . 

Variables  {1)1(11)  |  i  =  1 , 2,  .  .  .  ,  JQ}  and  {Ui(n)|i  =  1,2 . JP}  constitute 

the  complete  set  of  canonical  variables,  and  the  correlation 
coefficients  {pj  |  i  =  1 ,  2,  .  .  .  ,  JQ}  are  the  canonical  correlations  for 
the  random  vectors  X^p  and  X^.q  (recall  that  Q<P  is  assumed). 


In  summary,  Equations  (5-10)  through  (5-14)  constitute  the 
relations  that  must  be  solved  for  the  unknown  canonical 
correlations  and  weight  matrices. 


Maximum  Correlation  Solution  _(L  =  JQ_  Configuration)  .  Weight 
matrices  V  and  W,  matrix  H,  and  the  canonical  correlations  {p|} 

that  satisfy  Equations  (5-10)  through  (5-14)  are  obtained  as 

follows.  Given  the  array  output  data  block  covariances  defined  in 
Equations  (2-5) -(2-7),  form  the  coherence  matrix  Rgg  as 

(5-15)  =  E[9(n)5H(n)]  = 


—1/P  —1/P 

where  e(n)  =  5?.^.Q  QX^.Q(n)  and  5(n)  =  !^^.p  p  x^.p(n)  are  dummy  variables . 

As  inferred  in  Equation  (5-15) ,  Rgg  is  the  JQxJP  cross -covariance 

between  the  two  white  random  vectors,  e(n)  and  5(n)  (both  have 

identity  covariance  matrix) .  Now  apply  the  SVD  to  the  coherence 
matrix  ^05  to  obtain  a  factorization  of  the  form 


(5-16)  R0g=TiDTj 
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where  is  a  JQxJQ  unitary  matrix,  T2  is  a  JPxJP  unitary  matrix, 
and  D  is  a  real-valued  JQxJP  matrix  with  zero-valued  elements 
everywhere  except  along  the  main  diagonal  (recall  that  P>Q),  and 
the  elements  along  the  main  diagonal  are  arranged  in  order  of 
descending  magnitude.  The  real -valued  elements  along  the  main 
diagonal  are  the  JQ  canonical  correlations  between  the  past  and 
future  vectors,  arranged  in  descending  order.  That  is. 


Ol 

0  • 

0 

1  0  • 

1 

1 - 

0 

0 

P2  • 

0 

!  0  • 

1 

•  0 

1 - 

0 

0 

Rjq 

io  .. 

- , 

0 

(5-18)  1  >p^  >P2>.  .  .>PjQ>0 


with  D.|  a  JQxJQ  diagonal  real-valued  matrix,  and  D2  a  JQxJ(P-Q) 
null  matrix.  In  the  cases  where  P  =  Q,  matrix  D  is  square  and 
diagonal.  Also,  it  follows  from  Equations  (5-12b)  and  (5-17)  that 
Run  =  D . 


Matrices  and  T2  play  a  role  that  is  obscured  in  the  above 

formulation.  These  matrices  transform  the  intermediate,  dummy 
variables  e(n)  and  6(n)  into  the  canonical  variables  i)(n)  and  ix(n), 
respectively.  Specifically,  l)(n)  =  0(n)  and  jx(n)  =  Tg  5(n) . 


Based  on  the  above  development,  weight  matrices  W  and  V  and 
matrix  H  are  obtained  as 


(5-19)  W  = 


(5-20) 
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(5-21) 


H=3?.;“T2h 

where  l2\j  ~^2H  JPxJQ  and  JPxJ(P-Q)  partitions,  respectively, 

of  the  unitary  matrix  Tg;  namely, 

(5-22)  "^2  ~  ^2,1  -2,2  ^2,JQ  1^2,10+1  '  ^2,Jp]“['^2V  '"^211] 

The  covariance  matrix  of  the  data  residual  is  generated  using 
Equation  (2-13)  and  Equations  (5-10)  through  (5-18). 

For  the  MC  method  the  residual  covariance  is  a  simple 
function  of  the  non- zero  canonical  correlations;  specifically, 

'1-p^  0  -  0  " 

0  1-P2  -  0 

(5-23)  n  =  2ljQ-2D,=2{ljo-D,)=2  ,  ,  , 

0  0  ...  1-p^^ 

where  D.|  is  a  JQxJQ  diagonal  sub-matrix  of  D  defined  in  Equation 
(5-17)  .  The  diagonal  form  of  Q  simplifies  the  final  block 
filtering  step  in  the  detection  architecture  because  weight  matrix 
C  is  then  diagonal  also  (Equations  (2-14)  and  (2-15) ) .  That  is, 

(5“24)  C  =B  =£2 

—1/2 

where  Q  is  the  inverse  square -root  matrix  of  the  data  residual 
covariance  matrix.  The  normalized  and  uncorrelated  (temporally 
and  spatially)  JQ-element  data  and  steering  residuals  for  the  MC 
algorithm  are 

(5-25)  y(n)  =  c'^e(n)  n  =  P,  P+1, . . . ,  N-Q 
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(5-26) 


n  =  P,  P+1, .  . .  ,  N-Q 


s(n)  =  C^u(n) 

The  form  of  Q  provides  insight  into  the  issue  of  dimensionality 
reduction,  and  suggests  a  criterion  for  dimensionality  reduction 
(Section  5.2). 

The  detection  test  statistic  for  this  algorithm  attains  the 
form  given  in  Equation  (2-17)  ,  wherein  the  bloclc  form  of  the 
fully-processed  data  and  steering  vector  sequences,  {y(n)}  and  {s(n)}, 
respectively,  is  used  also. 

Maximum  Correlation  Solution  (L<  JQ_  Configuration) .  Consider  now 
the  condition  where  the  data  residual  vector  dimensionality  is  L < 
JQ.  This  condition  could  be  forced  by  the  need  to  reduce  the 
computational  requirements,  or  may  be  driven  by  the  recognition, 
utilizing  some  dimensionality- reducing  measure,  that  the  effect  of 
neglecting  the  canonical  variables  beyond  L+1  is  negligible  (see 
Section  5.2)  .  Under  such  conditions  only  a  subset  of  the 
canonical  variables  is  utilized,  even  though  the  complete  set  of 
canonical  variables  and  canonical  correlations  is  generated  by  the 
algorithm.  Specifically,  for  L<JQ,  only  the  variables  {Dj(n)  [  i  =  1 , 2, 
.  .  .  ,  L}  and  {|Xj(n)  ]  i  =  1, 2,  . .  .  ,  L}  and  the  correlation  coefficients  {Pj  |  i  =  1, 

2,  .  .  .  ,  L}  are  retained.  This  implies  the  formulas  for  the 

calculation  of  the  weight  matrices  must  be  modified. 

For  the  case  L  <  JQ  the  L- element  vector  sequences  {oc(n)  |  n  =  P, 
P+1, ...  ,  N-Q}  and  {^(n)  |  n  =  P,  P+1, ... ,  N-Q)  in  Figure  2-1  are  defined  as 

''»i(n)1 

(5-27)  a(n)=  ;  =  :  x^.Q{n)  =  W^x^.Q(n)  n  =  P,  P+1, ...,  N-Q 

[wl 
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(5-28) 


P(n) 


>i(n)‘ 

17  ■ 

Xy.p(n)  =  V^x^.p(n) 


p.p\ 


n  =  P,  P+1,...,N-Q 


Also,  it  is  convenient  to  define  a  partition  in  each  one  of  the 
two  matrices  and  T2  of  the  form 


■■■  '  "*"16] 

^2,JQ  ^2,Jp]“['^2A  '"^261 

where  and  T^g  are  JQxL  and  JQxJ(Q-L)  partitions,  respectively, 
of  the  unitary  matrix  ;  also,  ~^2k  "^26  JPxL  and  JPxJ(P-L) 

partitions,  respectively,  of  the  unitary  matrix  Tg.  The  partitions 

in  Equation  (5-30)  are  distinct  from  those  in  Equation  (5-22)  . 
However,  notice  that  T2^  =  T2y  and  T2g  =  T2|^  when  L  =  JQ.  Given  these 

definitions,  weight  matrices  W  and  V  are  obtained  as 


:5-29)  = 


-1,1  -1.2 


h  !  t 


L+1 


(5  30)  '^2“[i2,1  -2.2  -2,L  !  ^2,L+1 


(5-31) 


'A.J.QQ  '1A 


(5-32)  V  = 


As  before,  the  covariance  matrix  of  the  data  residual  is  generated 
using  Equation  (2-13)  and  Equations  (5-10)  through  (5-18),  which 
results  in. 


(5-33) 


Q  =  2Il-2D^^=2(Il-D^^)=2 


I-P1 

0 


0 

1-P2 


0 

0 


0  0  -  1-Pl 
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where 

is  an  l_xL  diagonal  sub-matrix  of  ;  namely, 

(5-34a) 

D 

'  Lfo] 

(5-34b) 

Dii  =  cliag[{p,,P2 

. PlH 

(5-34c) 

D^2  =  diag[{pL+i, 

Pl+2’"-*Pjq}] 

Weight  matrix  C  is  calculated  using  Equation  (5-24),  with  £1  as  in 
Equation  (5-33).  The  detection  test  statistic  is  generated  also 
via  Equation  (2-17)  .  This  completes  the  formulas  for  L<JQ. 

The  calculations  in  the  MC  algorithm  involve  matrices  of 
dimensions  JQ  and  JP,  with  Q<P.  In  contrast,  the  ME  algorithm 
requires  the  inverse  of  a  JNxJN  Hermit ian  matrix.  This  implies  a 
dramatic  reduction  in  the  computational  requirements  for  typical 
values  of  J  and  N,  specially  if  P  «  N.  In  addition,  the 
calculations  in  the  final  block  filtering  step  (weight  matrix  C) 
are  reduced  further  when  L<JQ  is  feasible  (Section  5.2). 

In  the  radar  array  context  considered  herein,  an  adaptive  MC 
(AMC)  algorithm  is  configured  by  using  ensemble  averages  (ML 
estimates)  over  the  secondary  data  in  place  of  unavailable  block 
covariance  matrices.  Analogously,  the  ML  estimate  of  the  residual 
covariance  matrix,  Q,  is  generated  as  an  ensemble  average  over  the 
residual  sequences  of  the  secondary  data  set.  Such  ML  estimate  is 
preferred  over  the  appropriate  model  estimate,  either  Equation  (5- 
23)  or  Equation  (5-33),  since  it  yields  better  detection 
performance  results  in  other  contexts  (Roman,  1998b) . 

In  the  adaptive  context,  dimensionality  reduction  induces  a 
reduction  in  the  number  of  elements  required  for  the  secondary 
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data  set,  as  for  the  OP  algorithm.  This  is  of  relevance  in 
practical  cases  where  the  scenario  places  constraints  on  the 
selection  of  the  secondary  data,  which  is  often  the  case  for 
typical  N  and  J  values.  In  addition,  due  to  such  dimensionality 
reduction  the  AMC  can  exhibit  better  detection  performance  than 
the  AMF  using  the  same  secondary  data,  or  equivalent  performance 
using  a  smaller  set  of  secondary  data,  as  for  the  OP  algorithm. 

In  its  most  general  form  (when  Q<P«N)  the  MC  method  is  a 
hybrid  processing  method,  since  it  has  both  block  and  sequential 
processing  aspects.  Specifically,  the  matrix  weights,  W  and  V, 
are  generated  and  applied  to  the  array  output  in  a  block  mode,  and 
this  is  repeated  several  times  to  generate  a  sequence  of  residual 
vectors.  A  parametric,  fully-sequential  implementation  of  the  MC 
method  is  obtained  by  generating  a  multichannel  (mul ti - input , 
mult i -output )  state  variable  model  (SVM)  using  one  of  the  two  sets 
of  canonical  variables  as  the  state  variables.  The  SVM  matrix 
parameters  are  generated  via  operations  on  the  past  and  future 
auto-  and  cross - covariances .  Such  an  approach  has  been  formulated 
and  tested  extensively  by  SSC  for  the  case  Q  =  P,  and  is  referred 
to  as  the  PAMF  using  canonical  correlations  (CC)  ,  or  PAMF-CC 
(Roman  and  Davis,  1993b;  Roman  et  al.,  1997,  1998).  The  PAMF 
admits  several  variations  wherein  other  linear  model  types  and/or 
model  identification  algorithms  are  utilized,  as  evidenced  by  the 
developments  in  Section  4.1. 

As  with  the  OP  algorithm,  the  MC  algorithm  admits  a  wide 
variety  of  configurations  based  on  the  selection  of  the  parameters 
P  and  Q.  One  such  configuration  is  the  PAMF-CC  mentioned  above. 
Another  is  the  matched,  maximum-dimension,  fully-block  algorithm 
obtained  when  Q  =  P  =  N/2.  At  the  other  end  of  the  spectrum  is  the 
sidelobe  canceler,  wherein  JQ  =  1  and  JP  =  N-1  (Roman,  199  8a)  . 
Further  analysis  of  these  variations  remains  for  future  work. 
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5.1  Information  Theory  And  Canonical  Correlations 


Information  theory  plays  a  key  role  in  the  interpretation  of 
canonical  variables,  and  in  the  formulation  of  a  dimensionality- 
reduction  criterion  (implemented  via  determination  of  parameter 
L) .  The  basic  information  theory  concepts  referred  to  herein  are 
discussed  in  detail  in  various  texts,  including  Beckmann  (1967) 
and  Thomas  (1969)  . 


The  amount  of  information  about  a  vector  random  process 
provided  by  a  realization  of  the  process  is  referred  to  as  self - 
information .  Self -information  of  a  realization  of  a  random  vector 
X,  denoted  herein  by  3S(x),  is  defined  as 


(5-35) 


3l(x)  =  -logJp(x)]  =  logt, 


1 

P(x) 


where  P(x)  denotes  the  probability  density  function  (PDF)  of  x,  and 
the  logarithm  base  b  is  arbitrary.  Notice  that  3((x)  is  a  random 
variable.  The  average  self -information  of  a  vector  random  process 
is  the  entropy  of  the  process.  It  follows  that  the  entropy  of  a 
random  vector  X,  denoted  herein  by  .?/(x),  is  defined  as 


(5-36)  ^(x)=  J-J3l(x)p(x)dx  =  - J-jlogJp(x)]p(x)dx 


The  entropy  of  a  random  vector  X  given  a  realization  of  another 
random  vector  z  is  referred  to  as  the  conditional  entropy. 
Conditional  entropy  of  X  conditioned  on  Z,  denoted  herein  by  i?/(x|z), 
is  defined  as 

oo 

(5-37)  ^(X|Z)  =  “  ■ogb[P(^|2)]  dZ 
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where  P(X,2)  and  P(><|z)  denote  the  joint  and  conditional  PDFs, 
respectively,  of  X  and  Z.  Notice  that  i^(xlz)  is  a  random  variable 
also.  The  joint  entropy  of  a  random  vector  X  and  a  random  vector 
Z,  denoted  herein  by  ,  is  defined  as 


(5-38)  i^(x,z)  =  -  logjp(x.z)]  p(x.z)  dx  dz 


Joint  entropy  and  conditional  entropy  satisfy  the  following  key 
relations , 

(5-39)  ^{X,z)  =  ^{x)  +  :H{z\x)  =  0<{z)  +  9{{x\z) 

The  last  concept  of  relevance  to  the  objectives  herein  is  the 
average  mutual  information  between  two  random  vectors,  X  and  Z. 
Most  often  this  concept  is  referred  to  simply  as  mutual 
information,  and  is  a  measure  of  the  information  in  X  about  Z,  or 
equivalently,  the  information  in  Z  about  X.  Mutual  information 
between  the  random  vectors  X  and  Z,  denoted  herein  by  /(x<-^z),  is 
defined  as 

(5-40)  /(x  <r^z)  =  ^{x)-9{{x\z)  =  !H{z)-!H{z\x) 

Using  Equation  (5-39)  it  is  straightforward  to  show  that 
(5-41)  I{x<r^z)  =  !H{x)  +  !H(z)-^H{x,z) 

Notice  that  mutual  information  is  an  entropy-type  quantity. 

In  the  context  of  interest  herein,  the  above-defined  measures 
are  determined  for  the  past  and  future  processes,  X^p  and  X^.q/ 

respectively,  individually  and  jointly.  Recall  that  the  array 
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output  process  {x(n)}  has  mean  zero  and  is  Gaussian-distributed . 
The  complex  Gaussian  PDF  for  a  zero-mean,  M-element  random  vector 
X  is  of  the  form 


(5-42)  p{x)  =  7i:  ^xj 

where  ^  is  the  MxM  covariance  matrix  of  X, 

(5-43)  i^  =  Exx^ 

and  1^1  denotes  the  determinant  of  This  PDF  expression  is 

required  to  represent  each  of  three  cases.  First,  the  PDF  of  Xy.p, 
with  M  =  JP  and  covariance  matrix  second,  the  PDF  of  X^.q, 

with  M  =  JQ  and  covariance  matrix  third,  the  joint  PDF 

of  X^p  and  X^.Q,  with  M  =  J(P+Q)  and  covariance  matrix 

(5-44)  E 

Then,  adopting  the  natural  (base  e)  logarithm  option,  the  entropy 
measures  for  the  three  processes  are  obtained  as 

(5-45)  i^(Xy.p )  =  JP  +  JP  ln[7c]  +  ln[|i??.y.p  p  I] 

(5-46)  ^(Xy:Q)  =  JQ  + JQ  ln[7c]  +  ln[|i?^^.QQ|] 

(5-47)  i^(x2,.p ,  x^.Q )  =  JP  +  JQ  +  JP  ln[7t]  +  JQ  ln[7i:]  +  ln[|^^y.p  p  | 

JQ 

i=1 
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Given  these  entropy  measures,  the  mutual  information  between  the 
past  and  future  of  the  array  output  process  {x(n)}  is  determined  as 
(via  Equation  (5-41)) 

JQ 

(5-48)  /(Xyp  ^X^,q)  =  - ^ln[l-pf 

i=1 


Mutual  information  is  a  measure  of  the  correlation  between  two 
processes  (recall  that  the  {p^}  are  correlation  coefficients)  .  If 

the  past  and  future  processes  are  uncorrelated,  Pj  =  0  for  i  =  1,2, 

JQ,  and  the  mutual  information  is  zero.  If  the  past  and  future 
processes  are  fully-correlated,  Pj  =  1  i  =  1.  2,  .  .  .  ,  JQ,  and  the 

mutual  information  is  infinite.  Equation  (5-48)  combines  all  the 
information  about  the  correlation  of  the  two  processes  in  a 
single,  scalar  measure.  This  compact,  powerful,  and  simple 
relation  is  fundamental  to  dimensionality  reduction. 

For  convenience  and  notational  simplicity,  let  r]  represent 
the  mutual  information  measure,  /(x^j-p  X^.q  ) ,  and  also  define  a  set 

of  variables  {k^  |  i  =  1 , 2,  .  .  .  ,  JQ}  as 


(5-49) 


i  =  1,2.  ...,JQ 


The  {Kj}  are  referred  to  herein  as  the  information  coefficients  for 

the  process  {x(n)}.  As  demonstrated  in  the  next  section,  these 
parameters  provide  a  means  for  dimensionality  reduction  in  the 
context  of  the  MC  algorithm.  Similarly,  they  provide  a  means  for 
model  order  reduction  in  the  parametric,  sequential  implementation 
of  the  MC  algorithm  (Roman  et  al . ,  1998) . 
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5 . 2  Dimensionality  Reduction  Criteria 


The  MC  algorithm  reduces  the  dimensionality  of  the  problem  in 
relation  to  the  MF  approach  in  two  ways.  First,  the  MC  algorithm 
operates  on  matrices  with  JQxJQ,  or  JQxJP,  or  JPxJP  elements, 
whereas  the  MF  algorithm  requires  the  inverse  of  a  JNxJN  matrix. 
In  most  cases  this  reduction  is  significant  because  P  and  Q  are 
selected  such  that  Q  <  P  «  N.  Second,  further  reduction  is 
achieved  by  retaining  only  the  L  most  significant  canonical 
variables,  with  L<  JQ. 

These  two  dimensionality-reducing  mechanisms  are  linked  in  a 
subtle  way.  Specifically,  if  P  and  Q  are  large-valued,  then  it  is 
likely  that  more  canonical  correlations  are  negligible  (L  is 
small)  than  when  P  and  Q  are  small -valued.  Viewed  in  a  different 
way,  if  many  canonical  correlations  are  negligible  for  the  P  and  Q 
values  selected,  then  it  is  likely  that  P  and  Q  are  too  large,  and 
a  smaller-valued  pair  can  be  adopted  with  negligible  loss  of 
information. 

Dimensionality  reduction  via  the  data  residual  vector 
dimensionality  parameter,  L,  is  the  focus  of  this  section.  Such 
reduction  is  accomplished  with  the  aid  of  criteria  based  on  the 
canonical  correlations.  Criteria  of  this  type  have  been  discussed 
by  Roman  and  Davis  (1993a;  1993b)  in  the  context  of  model  order 
selection  for  a  state  variable  parametric  model  for  the  ground 
clutter  in  an  airborne  surveillance  phased  array  radar  system. 
Two  of  the  criteria  discussed  by  Roman  and  Davis  (1993a;  1993b) 
are  of  relevance  herein.  In  addition,  two  criteria  based  on  the 
residual  covariance  matrix  are  proposed. 

Canonical  Correlations  Criterion.  This  criterion  consists  of 
examining  all  the  canonical  correlations  to  determine  the  integer 
ig  for  which  {pj  |  i  =  1 , 2,  .  .  .  ,  ig}  are  non-negligible,  and  {pj  i  i  =  io+1,  io+2,  .  .  . 
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,  JQ}  are  negligible  (the  so-called  " knee -in -the- curve " ) .  Then,  L 
is  selected  equal  to  Iq  .  If  the  canonical  correlations  exhibit  a 
continuous  variation  (fail  to  exhibit  " knee - in- the- curve " 
behavior)  ,  then  the  value  of  ip  is  selected  such  that  a  pre-set 
limit  to  the  sum  of  the  canonical  correlations  is  met.  Variations 
to  this  approach  can  be  defined  wherein  the  behavior  of  a  function 
of  the  canonical  correlations  (square;  information  coefficients; 
etc.)  is  considered  instead.  Such  is  the  context  of  the  other  two 
criteria  herein. 


Residual  Covariance  Matrix  Criteria.  Two  distinct  criteria  can  be 
defined  based  on  the  residual  covariance  matrix,  Q.  One  criterion 
is  based  on  the  trace  of  fl,  whereas  the  other  is  based  on  its 
determinant . 


Consider  the  residual  covariance  matrix  for  the  full- 
dimensionality  configuration  (L=JQ),  and  let  ^  denote  the  trace 
of  Q  normalized  by  the  scalar  term  2JQ;  that  is. 


(5-50) 


1 

2JQ 


tr[n]  =  1- 


1 

JQ 


where  tr[.]  denotes  the  trace  operator.  The  first  criterion  function 
is  then  defined  as  the  normalized  total  residual  variance  (as 
measured  by  the  trace  operator)  for  the  first  L-element  subset  of 
the  canonical  variables;  namely. 


1  <  L  <  JQ 
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Notice  that  ^(1)  <  .  .  .  <  ^(JQ-1)  <  ^(JQ)  =  1  .  Further,  if  the  canonical 
correlations  are  strictly-monotonic  and  non- zero,  1 >  >  P2  >  .  .  .  >  pjQ 

>  0,  then  i^(L)  is  a  monotonically- increasing  function  (notice  that 
p^  =  1  is  allowed  for  strict  monotonicity)  . 

For  the  second  criterion,  consider  again  the  residual 
covariance  matrix  for  the  full-dimensionality  configuration  (L= 
JQ) ,  and  let  ^  represent  the  determinant  of  normalized  by  the 
scalar  term  that  is, 

JQ 

^  =  _det[Q]  =  J^(l-Pi) 

^  i=1 

where  det[*]  denotes  the  determinant  operator.  The  second  criterion 
function  is  then  defined  as  this  quantity  normalized  by  the 
partial  residual  variance  (as  measured  by  the  determinant 
operator)  obtained  with  a  configuration  consisting  of  the  first  L- 
element  subset  of  the  canonical  variables;  namely, 

JQ 

,  JQ 

(5-53)  4(L)  =  -;— i - =  -!d - =n(1-Pi)  1<L<JQ 

ri(i-Pi)  n(i-Pi) 

i=1  i=1 

Notice  that  ^(1)  <  .  .  .  <  ^(JQ-1)  <  ^(JQ)  =  1  if  all  Pj  <  1  .  Function  ^(L)  is 

a  monotonically- increasing  function  if  the  canonical  correlations 
are  strictly-monotonic,  non-unity,  and  non- zero,  1  >  p.|  >  P2  >  .  .  .  >  pjQ 

>  0.  This  criterion  is  inadequate  when  at  least  one  canonical 
correlation  is  unity.  Such  cases,  however,  represent  pathological 
conditions . 

For  both  criteria,  the  value  of  the  criterion  function  at 
argument  L  is  indicative  of  the  portion  of  the  total  residual 
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variance  (the  variance  for  the  full -dimensional  configuration,  L= 
JQ)  that  is  retained  with  the  first  L-element  subset  of  the 
canonical  variables.  Each  criterion  thus  provides  a  measure  of 
the  degree  of  information  retained  in  a  configuration  of 
dimensionality  L  for  the  residual  vector.  The  larger  the  value  of 
L,  the  higher  the  degree  of  information  retained  in  the 
configuration.  Thus,  for  each  criteri'on,  specification  of  a 
percentage  threshold  allows  selection  of  the  parameter  L  as  the 
index  associated  with  the  total  residual  variance  (as  measured  by 
either  the  trace  or  the  determinant)  that  exceeds  the  pre¬ 
specified  threshold. 

Mutual  Information  Criterion.  Based  on  the  development  and 
definitions  of  Section  5.1,  the  mutual  information  between  the 
past  and  future  of  the  process  {x(n)}  is  the  sum  of  the  JQ 
information  coefficients  (recall  Equation  (5-49) ) , 

JQ 

(5-54)  n  =  ^Kj 

i=1 

A  related  quantity,  the  normalized  mutual  information  for  the 
first  L-element  subset  of  the  canonical  variables,  is  obtained  as 

JQ 

i=1 

The  value  of  this  function  represents  the  fraction  of  the  mutual 
information  in  the  past  about  the  future  that  is  retained  with  the 
first  L-element  subset  of  the  canonical  variables.  Thus, 
specification  of  a  percentage  threshold  for  the  mutual  information 
allows  selection  of  the  parameter  L  as  the  index  associated  with 


(5-55) 


ti(l)=— yKj= 

^  tr 
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the  normalized  mutual  information  that  exceeds  the  pre-specif ied 
threshold. 

Roman  et  al .  (1997)  applied  the  normalized  mutual  information 
criterion  to  state  space  model  identification  using  simulated  and 
measured  airborne  phased  array  radar  data.  The  effects  of 
variations  in  scenario  parameters  such  as  crab  angle,  platform 
velocity,  and  clutter- to-noise  ratio  (CNR)  were  considered  for  the 
simulated  data.  The  results  presented  therein  indicate  that 
highly-representative  state  variable  models  can  be  identified 
using  low-dimensionality  covariance  matrices. 

Remarks  .  In  summary,  another  important  feature  of  the  MC 
algorithm  is  the  fact  that  a  simple  and  powerful  criterion  is  used 
to  reduce  dimensionality  or  assess  information  losses.  That  is, 
either  determine  the  number  of  canonical  variables  to  keep  for 
achieving  a  pre-specif ied  level  of  a  probabilistic  measure  (total 
variance;  mutual  information) ,  or  assess  the  loss  of  correlation 
information  suffered  by  keeping  only  L  canonical  variables. 

The  impact  (if  any)  on  detection  performance  of  the  AMC  due 
to  large  reductions  in  dimensionality  remains  to  be  assessed. 
Preliminary  work  carried  out  for  the  related  PAMF-CC  suggests  that 
considerable  reductions  in  dimensionality  are  possible  while 
surpassing  the  performance  of  the  AMF  (Roman  et  al . ,  1998). 
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6.0  ORTHOGONAL  PROJECTION  WITH  CANONICAL  VARIABLES  (OPCV) 

Application  of  the  orthogonal  projection  principle  to  the 
past  and  future  canonical  variables  results  in  an  important 
extension  to  the  MC  formulation.  This  enhanced  formulation  is 
based  on  determination  of  the  orthogonal  projection  of  the  space 
spanned  by  the  "future"  canonical  variables  onto  the  space  spanned 
by  the  "past"  canonical  variables.  Such  projection  is  an  optimal 
prediction  of  the  future  given  the  past,  and  the  resulting 
prediction  error  vector  constitutes  the  residual  vector  in  the 
generic  STAP  architecture.  The  orthogonal  projection  with 
canonical  variables  (OPCV)  algorithm  is  an  application  of 
orthogonal  projection  in  the  context  of  a  structure  wherein  the 
array  output  data  (past  and  future  vectors)  are  represented  in  the 
canonical  variables  basis.  As  such,  the  OPCV  for  the  full¬ 
dimensional  residual  case  (L=JQ)  is  equivalent  analytically  to 
the  OP.  However,  an  important  distinction  between  the  OPCV  and 
the  OP  is  that  the  OPCV  includes  an  optimal  mechanism  for 
dimensionality  reduction  (selection  of  L<JQ),  which  the  OP  lacks. 

Full-Dimension  Residual  Configuration  (L.  =  JQl-  As  indicated  in 
Section  4.0,  an  orthogonal  projection  in  the  vector  spaces 
considered  herein  is  equivalent  to  conditional  expectation.  Also, 
recall  from  Section  5.0  that  l)(n)  and  jj,(n)  denote,  respectively,  the 
JQ  -element  future  and  JP-element  past  canonical  variables  for  the 
array  output  process.  Now  let  9^  denote  the  space  spanned  by  jl(n) 
(notice  that  9^  =  X~ )  ,  and  let  denote  the  expectation  of 
the  future  canonical  variables  conditioned  on  the  past  canonical 
variables.  Then,  for  appropriately- selected  integers  P  and  Q, 

(6-la)  h(n|f7^f)  =  E[v(n)^>)](E[^(n)^H(n)]|-''^ 
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(6-lb) 


□2  ]  ^(n)  =  [ 


[0]  ] 


(6-lc)  ;u(n|?l^)  =  D^p,^(n) 

where  %(n)  and  itH(n)  are  JQ-  element  and  J(P-Q)  -element  vectors, 
respectively,  as  defined  in  Equations  (5-6),  and  where  Equations 
(5-11),  (5-12),  and  (5-17)  have  been  invoked.  Consider  now  the 

OPCV  algorithm  in  the  context  of  the  generic  STAP  architecture 
(Figure  2-1) .  In  this  context,  the  JQ-element  OPCV  variables  a(n) 
and  m  are  defined  in  terms  of  the  canonical  variables  as 

(6-2)  a(n)  =  \)(n) 

(6-3)  g(n)=  ;o(n|f)l/)  =D^ii^(n) 

respectively.  The  OPCV  JQ-element  data  and  steering  residuals  are 
generated  as 


(6-4a) 

e(n)  =  a(n)  -  g(n)  =  v{n)  -  ) 

n  =  P, . 

. . ,  N-Q 

(6-4b) 

£(n)  =  2i(n)  -  D,  u^,(n)  =  -  D,  v”c  x„(,n) 

n  =  P,. 

. . ,  N-Q 

(6-5) 

y(n)  =  wl3,oe^^Q(n)-D,vUce„(n) 

n  =  P,., 

. . ,  N-Q 

respectively,  and  where  and  V|^q  denote  the  weight  matrices  for 
the  MC  algorithm  (Equations  (5-19)  and  (5-20),  respectively). 
From  these  expressions  it  follows  that  the  OPCV  algorithm  weight 
matrices  W  and  V  are  defined  as 

(6-6)  W  = 
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(6-7) 


V  =  V^cD,  =  =^^PpT2vD, 


respectively.  Next,  the  OPCV  residual  covariance  matrix  is 
obtained  from  Equations  (2- 13b) ,  (6-6) ,  and  (6-7) ,  as 

1-pf  0  0 

0  1-p^  0 

0  0  ■■  1-p5q 

Covariance  matrix  Q  generated  according  to  Equation  (6-8)  is  the 
model  residual  covariance  referred  to  in  Remark  2.2.4. 

The  normalized  and  uncorrelated  (temporally  and  spatially) 
JQ  -element  data  and  steering  residuals  for  the  OPCV  algorithm  are 

(6-9)  Y(n)  =  c'^e(n)  n  =  P,  P+1, . . . ,  N-Q 

(6-10)  s(n)  =  c'^u(n)  n  =  P,  P+1, ...,  N-Q 

The  JQxJQ  weight  matrix  C  is  a  diagonal  matrix  specified  as 

(6-11)  =  B"'' = 

—1/2 

where  Q.  is  the  inverse  square-root  matrix  of  the  data  residual 
diagonal  covariance  matrix.  The  detection  test  statistic  is 
generated  using  Equation  (2-17) .  This  completes  the  formulas  for 
the  L  =  JQ  configuration. 

Reduced-Dimension  Residual  Configuration  (L<JQ_)_.  Cases  wherein 
the  data  residual  vector  dimensionality  is  L  <  JQ  arise  due  to 
processing  and/or  performance  requirements,  as  stated  in  Section 


60 


5.0.  In  the  OPCV  algorithm  formulation  for  such  cases  an 
orthogonal  projection  is  desired  from  a  subset  of  the  future 
canonical  variables  onto  the  past  canonical  variables. 
Specifically,  for  L<JQ,  only  the  variables  {\)j(n)  |  i  =  1 , 2,  .  .  .  ,  L}  and 

{(ii(n)  I  i  =  1 , 2 . L),  and  the  correlation  coefficients  {pj  |  i  =  1,2, 

are  retained.  Thus,  the  formulas  for  the  calculation  of  the 
weight  matrices  must  be  modified.  Determination  of  the  value  of  L 
is  considered  in  Section  6.2. 

For  the  case  L  <  JQ  it  is  convenient  to  define  an  LxJQ 
selector  matrix  Sl  as 

(6-12)  Sl  =  [Il  [0]ljq_J 


Then,  the  sub-vector  of  l)(n)  formed  from  the  first  L  future 
canonical  variables  is  denoted  as 

'u/n)' 

(6-13)  ^  1<L<JQ 

Now  let  ;0|_(n|f^)  denote  the  expectation  of  the  first  L  future 

canonical  variables  conditioned  on  the  past  canonical  variables. 
Then,  for  appropriately- selected  integers  P  and  Q, 

(6-i4a)  «Jn|5K')  =  E[s)L(n)^H(n)](E[ji(n)^»(n)])'V(n)  =  SLR„p|i(n) 


(6-14b)  2L(n|«')  =  [lL  [0]ljq_J 


^11  t®ljO-L  I®i,J(P-Q) 

[0]jQ  -L.J(P-Q) 


(6-14c)  UL(n|5V^)  =  DiiUL(n) 
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where  Equations  (5-11),  (5-12),  (5-17),  (5-34),  and  (6-12)  have 

been  invoked,  and  where  il|_(n)  is  a  vector  of  the  first  L  past 

canonical  correlations. 


(6-15) 


1  <  L  <  JQ 


Given  these  relations,  the  L-element  OPCV  variables  Oc(n)  and  ^(n) 
are  defined  as 


(6-16)  a(n)  =  \)L(n) 

(6-17)  ^(n)  = 

respectively,  and  the  OPCV  L-element  data  and  steering  residuals 
are  generated  as 

(6-i8a)  e(n)  =  a(n) -g(n)  =i)|_(n)- ;U|_(n|5Vf)  n  =  P,  ...,N-Q 

(6- 18b)  e(n)=i)L(n)-DiiiiL(n)=  n  =  P, . . . ,  N-Q 

(6-19)  u(n)  =  WMCL§^.Q(n)-DiiVMCL§y:p(n)  n  =  P,  ...,N-Q 

respectively.  In  these  expressions,  Wj^ql  and  V|^q|_  denote  the  JQxL 
and  JPxL,  respectively,  weight  matrices  for  the  reduced- 
dimensionality  MC  algorithm  (Equations  (5-31)  and  (5-32), 
respectively) .  It  follows  that  the  reduced-dimensionality  OPCV 
algorithm  weight  matrices  W  and  V  are  defined  as 

(6-20)  w  =  = 
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(6-21) 


V  =  V„eLD„  =  ^-''5.T2AD„ 

respectively.  These  weight  matrices  are  substituted  in  Equation 
(2-13)  to  obtain  the  LxL  OPCV  residual  covariance  matrix  as 

1-p2  0  •••  o' 

0  1-p2  •••  0 

0  0  1-p^ 

Weight  matrix  C  is  calculated  using  Equation  (6-11) ,  with  Q.  as  in 
Equation  (6-22).  The  detection  test  statistic  is  generated  also 
via  Equation  (2-17)  .  This  completes  the  formulas  for  L<JQ. 

The  OPCV  algorithm  admits  an  adaptive  formulation  also,  which 
is  referred  to  as  the  adaptive  OPCV  (AOPCV) .  In  the  AOPCV  each 
unavailable  block  covariance  matrix  is  replaced  by  its  ML 
estimate,  obtained  as  an  ensemble  average  over  the  secondary  data. 
As  is  the  case  for  the  other  two  algorithms,  the  ML  estimate  of 
the  residual  covariance  matrix  (obtained  as  an  ensemble  average 
over  the  residuals  for  the  secondary  data  set)  is  preferred  over 
the  model  residual  covariance  estimate  (generated  via  either 
Equation  (6-8)  or  Equation  (6-22)  using  the  estimated  canonical 
correlations) . 

Since  the  OPCV  algorithm  is  based  on  the  canonical  variables 
basis,  the  comments  made  for  the  MC  algorithm  in  Section  5.0 
regarding  issues  such  as  adaptability,  dimensionality  reduction, 
implementation  aspects,  computational  issues,  and  algorithm 
structure  issues  are  valid  also  for  the  OPCV.  However,  issues 
unique  to  the  OPCV  are  discussed  in  the  sections  that  follow. 


(6-22)  Q  =  = 
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6 . 1  Optimality  Of  The  OPCV 


Yohai  and  Garcia  Ben  (1980)  have  demonstrated  that  the 
canonical  variables  constitute  the  optimal  solution  to  a  reduced- 
dimensionality  linear  prediction  problem.  In  the  context  of  the 
notation  and  formulation  adopted  herein,  the  problem  addressed  in 

(Yohai  and  Garcia  Ben,  1980)  is  stated  as  follows.  Given  the  JQ- 
element  future  vector  and  the  JP-element  past  vector  Xy,p(n), 

then,  for  L<  JQ,  determine  a  JPxL  linear  transformation  T  on  the 
past  process  of  the  form 

(6-23)  z(n)  =T^x^p(n) 

such  that  an  orthogonal  projection  of  the  space  spanned  by  Xj.q  (n) 
onto  the  space  spanned  by  z(n),  denoted  as  Z>,  minimizes  the 
criterion 


(6-24) 


J(T)  =  det 


H 

[%:Q(n)  -  x^.Q(n|Z  )][x^.Q(n)  -  x^.Q(n|2; )] 


where  X^.Q(n|Zl)  is  the  orthogonal  projection. 


(6-25)  x^.Q(n|2;)  =  E[x^.Q(n)z^(n)](E[z(n)z>)])  ^z(n) 


Notice  that  2{n)  as  defined  in  Equation  (6-23)  is  an  L-element 
vector,  which  implies  that  Z>  cz  X  .  This  completes  the  statement 
of  the  problem  formulated  by  Yohai  and  Garcia  Ben  (1980) . 

The  optimization  problem  stated  above  admits  as  one  solution 
that  matrix  T  be  selected  as  the  JPxL  weight  matrix  the  past 

canonical  variable  transformation  for  the  case  L  <  JQ  (see  Equation 
(5-32)),  and  the  transformed  variables  are  the  first  L  past 
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canonical  variables.  Specifically,  one  optimal  solution,  denoted 
by  superscript  "o" ,  is 

(6-26)  = 

( 6 - 27 )  z»(n)  =  fijn)  =  vJJcL  X3,.p(n)  =  T"*  p(n) 

Other  equivalent  optimal  solutions  are  generated  by  selecting  the 
transformation  matrix  to  be  of  the  form  T  =  V.,„,  F,  where  F  is  a 

full-rank  LxL  matrix.  More  specifically,  matrix  F  is  any  general 
full-rank  matrix  if  Pl  >  Pl+i  '  ^  linear  function  of  the 

singular  vectors  for  repeated  canonical  correlations  if  Pq4.i  =  ’"“PL 

for  some  q  such  that  1  <q<L.  The  minimum  value  of  the  criterion 
J(T)  (Equation  6-24) ) ,  denoted  by  J°(T),  is 

(6-28)  J»(T)  =  det[)^^^„,,]n(l-pf) 

i=1 

This  completes  the  solution  presented  by  Yohai  and  Garcia  Ben 
(1980) ,  with  minor  modifications  to  fit  the  context  herein. 

In  summary,  two  important  points  are  established  by  the 
Yohai-Garcia  Ben  result.  First,  the  past  canonical  variables 
constitute  an  optimal  basis,  with  respect  to  Criterion  (6-24),  to 
represent  an  L-dimensional  proper  sub-space  of  X~  .  Second,  the 
solution  provided  by  the  past  canonical  variables  has  the  simplest 
form,  since  all  other  solutions  involve  an  additional  matrix 
factor,  F. 

The  Yohai-Garcia  Ben  result  is  significant  in  the  context  of 
the  reduced-dimensionality  OPCV  because  it  states  that  the 
selection  of  the  variables  a(n)  and  ^(n)  as  in  Equations  (6-16)  and 
(6-17) ,  respectively,  is  optimal  according  to  Criterion  (6-24)  . 
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Specifically,  with  x^.Q(n)  expressed  as  l)|_(n),  and  X^p(n)  expressed  as 
the  Yohai-Garcia  Ben  result  states  that  z(n)  =  ,  and 

Criterion  (6-24)  becomes  J(T)  =  det[Q]  since  Q  =  E[e(n)8H(n)]  when  the 

data  residual  e(n)  is  generated  via  Equation  (6-18).  With  Q  given 
as  in  Equation  (6-22),  the  optimal  criterion  value  is  then 

L 

(6-29)  J°(T)=  J^(l-pf)  FOR  OPCV  ALGORITHM 

i=1 

Equation  (6-29)  results  from  Equation  (6-28)  when  X^  :Q(n)  is 
expressed  as  l)|_(n)  (that  is,  in  canonical  variables  basis)  because 
the  covariance  matrix  of  2)|_(n)  is  1^^. 

The  optimal  criterion  value  in  Equation  (6-29)  can  serve  as  a 
criterion  for  reducing  the  dimensionality  of  the  data  residual 
vector  in  the  OPCV  algorithm  (selecting  L) ,  which  is  one  of  the 
two  ways  in  which  dimensionality  reductions  can  be  attained  with 
the  MC  and  OPCV  algorithms  (see  Section  5.2) .  It  turns  out  that 
one  of  the  two  residual  covariance  matrix  criteria  types  proposed 
in  Section  5.2  leads  to  an  identical  expression.  This  issue  is 
explored  next,  in  Section  6.2. 

6.2  Dimensionality  Reduction  Criteria 

Since  the  OPCV  algorithm  is  based  on  the  MC  algorithm,  the 
issues  and  criteria  for  dimensionality  reduction  discussed  in 
Section  5.2  are  applicable  also  to  the  OPCV.  The  exception 
involves  the  criteria  based  on  the  residual  covariance  matrix,  Q, 
because  each  algorithm  has  a  different  residual  covariance  matrix. 
Thus,  only  the  residual  covariance  matrix  criteria  are  discussed 
herein.  As  in  Section  5.2,  two  distinct  criteria  are  defined 
based  on  one  based  on  its  trace,  the  other  based  on  its 

determinant.  Both  criteria  have  the  same  analytical  form  as  their 
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MC  counterparts.  A  key  distinction,  however,  is  that  the  MC 
criteria  are  functions  of  p^,  whereas  the  OPCV  criteria  are 
2 

functions  of  Pj  . 


Residual  Covariance  Matrix  Trace  Criterion.  Consider  the  residual 
covariance  matrix  for  the  full -dimensionality  configuration  (L = 
JQ) ,  and  let  denote  the  trace  of  Q  normalized  by  the  scalar  term 
JQ;  that  is. 


(6-30) 


C  =  — tr[a]  =  1-  — 
^  JQ  ^  ^  JQ 


The  first  criterion  function  is  defined  as  the  normalized  total 
residual  variance  (as  measured  by  the  trace  operator)  for  the 
first  L-element  subset  of  the  canonical  variables;  namely. 


1  <L<  JQ 


Notice  that  (^(1)  <  .  .  .  <  ^(JQ-1)  <  C(JQ)  =  1  •  Further,  if  the  canonical 
correlations  are  strictly-monotonic  and  non-zero,  1 > p^ >  P2  > .  .  .  >  pjQ 

> 0,  then  ^(L)  is  a  monotonically- increasing  function  (notice  that 
p.|  =  1  is  allowed  for  strict  monotonicity)  . 

Residual  Covariance  Matrix  Determinant  Criterion.  As  before, 
consider  the  residual  covariance  matrix  for  the  full- 
dimensionality  configuration  (L  =  JQ),  and  let  ^  represent  the 
determinant  of  Q, 


67 


JQ 

(6-32)  ^  =  det[Q]  =  p^(l-pf) 

i=1 

The  second  criterion  function  is  defined  as  this  quantity 
normalized  by  the  partial  residual  variance  (as  measured  by  the 
determinant  operator)  obtained  with  a  configuration  consisting  of 
the  first  L-element  subset  of  the  canonical  variables;  namely, 

JQ 

t  n(^~pp)  JO 

(6-33)  yL)  =  -;— 2 - =  - =nO"Pi)  1SL<JQ 

no-pi)  no-pi) 

i=1  i=1 

Notice  that  ^(1)  <  .  .  .  <  ^(JQ-1)  <  ^(JQ)  =  1  if  all  Pj  <  1  .  Function  ^(L)  is 

a  monotonically- increasing  function  if  the  canonical  correlations 
are  strictly-monotonic ,  non-unity,  and  non-zero,  1  > p^  >  Pg  >  .  .  ■  > pjQ 

>0.  As  for  its  MC  counterpart,  this  criterion  is  inadequate  when 
at  least  one  canonical  correlation  is  unity;  however,  the 
occurrence  of  canonical  correlations  of  exactly  unity  value  is  an 
unlikely  event  due  to  uncorrelated  noise  and  computational  errors. 
Notice  that  the  un-normalized  criterion  ^  of  Equation  (6-32)  with 
JQ  replaced  by  L  is  identical  to  the  minimum  value  of  the  Yohai- 
Garcia  Ben  criterion  for  the  case  when  the  array  output  data  is  in 
canonical  variables  basis,  Equation  (6-29) . 

Remarks .  For  both  criteria,  the  value  of  the  criterion  function 
at  argument  L  is  indicative  of  the  portion  of  the  total  residual 
variance  (the  variance  for  the  full-dimensional  configuration,  L= 
JQ)  that  is  retained  with  the  first  L-element  subset  of  the 
canonical  variables.  Each  criterion  thus  provides  a  measure  of 
the  degree  of  information  retained  in  a  configuration  of 
dimensionality  L  for  the  residual  vector.  The  larger  the  value  of 
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L,  the  higher  the  degree  of  information  retained  in  the 
configuration.  Thus,  for  each  criterion,  specification  of  a 
percentage  threshold  allows  selection  of  the  parameter  L  as  the 
index  associated  with  the  total  residual  variance  (as  measured  by 
either  the  trace  or  the  determinant)  that  exceeds  the  pre¬ 
specified  threshold. 


Residual  Covariance  Matrix  Comparison:  MC  And  OPCV 


The  OPCV  residual  is  less  than  or  equal  to  the  MC  residual 
for  all  configurations  (L  <  JQ)  .  This  is  expected  since  the 

orthogonal  projection  is  the  best  linear  predictor  of  a  JQ-element 
vector  X^.Q  (n)  based  on  a  JP-element  vector  X2>p(n)  for  a  variety  of 

criteria,  including  minimization  of  each  of  the  following  two 
functions  of  the  prediction  error,  £(n), 

(6-34a)  J2(T)  =  E[l|e(n)ll2]  =  E  x^.Q(n)-x^.Q(n|X“)  ^  =  ^  ||  II2 

( 6 -34b)  J2(T)  =  tr[E[e(n)e^^(n)]]  =  tr  e|^ [%.Q(n) - X2,.p(n)][x^.Q(n) - X2,.p(n)]^ 
(6-35a)  JD(T)  =  det[E[8(n)eH(n)]] 


(6-35b)  Jp(T)  =  det  E  x^.Q(n)-x^.Q(n|;\C  )][%.Q(n)-x^.Q(n|;C  ) 


(6-35c)  Jo(T)  =  det  E  [x^.Q(n)-T^Xy.p(n)][x^.Q(n)-T^X2,.p(n)]^ 

where  T  denotes  a  JPxJQ  linear  transformation,  and  ||•||2  denotes 

the  vector  Euclidean  norm  (or  2-norm) .  The  orthogonal  projection 
solution  is 
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(6-36) 


><^.Q(n)x5.p(n)](E[xy.p(n)x5.p(n)]) 


Minimality  of  the  OPCV  algorithm  in  relation  to  the  MC  algorithm 
is  demonstrated  for  each  of  these  two  criteria. 


Consider  first  the  Euclidean  norm  (covariance  matrix  trace) 
criterion,  J2(T),  for  the  general  case  L<JQ.  For  this  criterion  it 
is  necessary  to  demonstrate  that  tr[f2|y^Q]  >  tr[QQpQy  ] .  This  inequality 

is  true  if  the  following  condition  is  satisfied, 

L  L 

(6-37)  2L-2^Pi  >  L-^pp  1<L<JQ 

i=1  i=1 

Straightforward  manipulation  of  Inequality  (6-37)  leads  to  the 
expression 


L 

(6-38)  ^(pf-2pi  +  l)  >  0  1<L<JQ 

i=1 

Inequality  (6-38)  is  satisfied  if  pp-2pj+1>0  is  satisfied  for 

each  value  of  i  =  1,2,  The  left-hand-side  of  this  expression 

is  a  quadratic  polynomial  with  double  root  at  pj  =  1 ,  with  value  1 

at  Pj  =  0,  and  positive -valued  over  the  domain  of  definition  for  the 
canonical  correlations:  0  <  Pj  <  1  for  i=1,2,  Thus,  condition 

-  M^OPCv]  is  indeed  true . 

Consider  next  the  covariance  matrix  determinant  criterion, 
Jd(T),  for  the  general  case  L<JQ.  For  this  criterion  it  is 
necessary  to  demonstrate  that  det[Qj^Q]  >  det[QQpQy  ] .  This  inequality 

is  true  if  the  following  condition  is  satisfied. 
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(6-39) 


1  <  L  <  JQ 


n2(i-Pi)  s  no-pn 


straightforward  manipulation  of  Inequality  (6-39)  leads  to  this 
equivalent  expression 


(6-40) 


1  <  L  <  JQ 


Inequality  (6-40)  is  satisfied  if  Pj^1  is  satisfied  for  each  value 

of  i  =  1,  2,  .  .  .  ,  L,  which  is  satisfied  indeed  over  the  domain  of 
definition  for  the  canonical  correlations.  Thus,  condition 
det[Q^^c]- ^®^[^0PCv]  true. 


In  summary,  the  OPCV  algorithm  residual  is  less  than  or  equal 
to  the  MC  algorithm  residual  for  all  configurations  (L<JQ),  and 
according  to  both  optimization  criteria. 

An  important  point  to  notice  is  that  the  MC  algorithm  could 
out-perform  the  OPCV  algorithm  due  to  software  implementation  and 
computational  accuracy  issues,  even  though  it  is  less  optimal  in 
theory.  For  example,  if  the  numerical  stability  of  the  estimates 
of  the  canonical  correlations  is  poor,  then  the  OPCV  weight  matrix 
V  computed  via  Equation  (6-7)  is  less  accurate  than  the  MC  weight 
matrix  V  computed  via  Equation  (5-20) .  This  issue  should  be 
pursued  further  via  simulation-based  analyses. 

6.4  Past-Only  OPCV  Configuration 

The  Yohai -Garcia  Ben  result  discussed  in  Section  6.1  suggests 
a  variation  of  the  OPCV  algorithm  in  which  only  the  past  block 
vector  is  transformed  onto  the  canonical  variables  basis.  In  such 
a  configuration,  referred  to  herein  as  the  Past-Only  OPCV 
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configuration,  only  the  past  canonical  transformation  matrix,  V[^q, 
is  generated  and  applied  to  the  past  block  vector,  Xy,p  (n).  Thus , 

the  weight  matrices  W  and  V  in  Figure  2-1  for  the  Past -Only  OPCV 
configuration  of  reduced  dimensionality  for  the  residual  vector  (L 
<JQ)  are 


(6-41)  W  =  ljQ 


(6-42) 


V  =  V^clVmcl 


V:Q,!P:P 


<^-1/2 

"^!P:P,P 


T2aT2A^ 


-1/2 

T.PP 


respectively.  In  these  expressions,  the  JPxL  weight  matrix  V|^qi_  is 
as  determined  via  Equation  (5-32)  .  Comparing  these  weight 
matrices  with  the  corresponding  ones  for  the  OPCV  configuration 
(Equations  (6-20)  and  (6-21)),  notice  that  weight  matrix  V  has  a 
simpler  form  in  the  OPCV  configuration,  whereas  the  reverse  is 
true  for  weight  matrix  W. 


Consider  now  the  covariance  matrix  of  the  data  residual  . 
Using  Equation  (2-13),  Q.  is  obtained  as 


(6-43) 


r.Q,Q 


^f  :Q,T:P  ^T:P,P  *^2A  '  2A 


■OA  A.j,.pp  A.^;Q_y;P 


straightforward  algebraic  manipulations  result  in  the  following 
equivalent  expression  for  iQ, 


(6-44) 


0=4?  ■'‘'2  j 

■^r.Q,Q  1 


'l-Dfi 

[0] 


[0] 

'jQ-L 


<TP  1/2 

'l  ^y:Q,Q 


Equation  (6-44)  leads  to  a  simple  form  for  the  determinant  of  Q.; 
namely. 
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L 

(6-45)  det[a]  =  det[^?^^^QQ]  J~[(l-pf) 

i=i 

This  expression  is  identical  to  the  optimal  value  for  the 
criterion  of  the  Yohai-Garcia  Ben  result  discussed  in  Section  6.1, 
which  is  expected  since  the  criterion  expression  (Equation  (6-24) ) 
becomes  J(T)  =  det[a]  for  the  Past-Only  OPCV  configuration. 

Weight  matrix  C  and  the  detection  statistic  are  generated  via 
formulas  analogous  to  those  stated  in  Section  6.0  for  the  reduced- 
dimension  residual  configuration. 

The  Past -Only  OPCV  may  have  advantages  over  the  OPCV  in  terms 
of  numerical  accuracy  and/or  computational  load  because  only  one 
block  vector  is  transformed.  Generation  of  the  model  residual 
covariance  for  the  Past-Only  OPCV  requires  more  computations  than 
for  the  OPCV;  however,  generation  of  the  residual  covariance  is  an 
irrelevant  issue  in  the  context  of  the  generic  architecture 
because  the  approach  preferred  for  either  algorithm  is  to  use  the 
sample  residual  covariance  (which  involves  the  same  procedure  for 
either  algorithm) .  Further  comparisons  between  OPCV  and  the  Past- 
Only  OPCV  remains  as  a  task  for  future  work. 

6 . 5  OP  And  OPCV  (L  =  JQ  Configuration)  Equivalence 

Consider  the  OPCV  algorithm  for  the  full -dimension  residual 
configuration;  that  is,  with  L  =  JQ .  For  this  case  the  OPCV  is 
equivalent  to  the  OP  with  the  array  output  data  represented  in  the 
canonical  variables  basis,  as  shown  next. 

In  the  context  of  the  generic  STAP  architecture  in  Figure  2- 

1,  assume  the  array  output  data  is  represented  in  the  canonical 
variables  basis.  That  is,  the  JQ-  element  block  vector  :Q(n)  is 
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replaced  by  the  JQ-element  vector  ij(n),  and  the  JP-element  block 
vector  Xy.p(n)  is  replaced  by  the  JP-element  vector  Ji(n) .  Under 

these  conditions,  the  OP  weight  matrix  W^p  is  the  JQxJQ  identity, 

as  before  (Equation  (4-6)), 

(6-46)  “  *JQ 

and  the  JPxJQ  OP  weight  matrix  V^p  is  (via  Equation  (4-7)), 

(6-47)  V^p  =  R„^  =  [D,  I  [0]] 


where  is  as  defined  previously.  The  JQxJQ  OP  data  residual 
covariance  matrix  is  obtained  as  (via  Equation  (4-13)), 


(6-48) 


^OP  ~  *JQ 


1-p?  0 

0  1-pI 


0 

0 


0 


which  is  identical  to  the  OPCV  data  residual  covariance  matrix. 
Equation  (6-8)  .  Thus,  the  OP  with  the  array  output  data 
represented  in  the  canonical  variables  basis  is  equivalent  to  the 
OPCV  for  the  full -dimension  residual  configuration,  L=JQ. 


The  advantages  of  the  OPCV  over  the  OP  include  that  the  OPCV 
structure  allows  for  dimensionality  reduction  in  the  residual 
vector,  and  that  one  approach  to  dimensionality  reduction  with  the 
OPCV  is  based  on  the  optimal  solution  to  a  well-posed  minimization 
problem,  as  shown  in  Sections  6.1  and  6.2.  In  addition,  the  OPCV 
provides  unique  insight  into  the  structure  of  orthogonal 
projections  in  the  context  of  STAP  applications. 
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7.0  TWO-DIMENSIONAL  ARLS  AND  PAMF  DETECTION  FOR  STAR 


The  auto-regressive  (AR)  subclass  of  2-D,  linear,  shift- 
invariant  parametric  time  series  models  was  selected  to  represent 
the  channel  output  under  the  null  hypothesis,  with  its  attendant 
increase  in  modeling  degrees-of -freedom  (independent  dynamic  and 
static  modeling  capability  along  each  axis)  over  the  1-D  vector 
representation.  The  complete  time  series  model  class  (MA,  AR,  and 
ARMA  models)  was  considered,  but  the  AR  subclass  and  the  2DARLS 
model  identification  algorithm  in  particular  provided  major 
computational,  software  implementation,  and  modeling  performance 
advantages.  Since  each  member  of  the  AR  model  subclass  is 
invertible,  the  2-D  AR  least-squares  (2DARLS)  algorithm  inherently 
generates  the  whitening  filter  used  in  the  2-D  PAMF  detection 
architecture  presented  herein. 

7.1  Two-Dimensional  ARLS  Model  Identification 


The  channel  output  vector  sequence  Wn))  can  be  viewed  as  a 
scalar  2-D  sequence  (Dudgeon  and  Mersereau,  1984) .  Let  channel  J 
be  the  temporal  and  spatial  reference  for  the  array,  and  define 
the  following  association, 

(7-1)  Xj.|^(n)  =  x(n,k)  0 < n < N-1;  0 < k< J-1 

The  process  {x(n,k)}  has  a  scalar  2-D  power  spectrum  denoted  as 
and  a  scalar  2-D  auto-covariance  sequence  (ACS)  denoted 
as  {rxx(fD,.£)} ,  where  is  the  lag  pair  corresponding  to  the 

normalized  frequency  pair  • 

The  association  defined  by  Equation  (7-1)  was  adopted  for 
Phase  I  and  its  usage  is  continued  in  Phase  II  because  it  is 
consistent  with  the  relation  between  multichannel  1-D  and  scalar 
2-D  systems  demonstrated  by  Therrien  (1981).  Alternatives  to 


75 


Equation  (7-1)  can  be  defined,  such  as  {x,^^^(n)  =  x(n,k)  |  n  =  0,  1 ,  .  .  .  ,  N-1 ;  k 
This  specific  alternative  definition  corresponds  to 
the  MATLAB  software  default  array  definition  convention. 

Model-based  detection  as  considered  herein  is  predicated  on 
the  representation  of  the  channel  output  process  {x(n,k)}  as  the 
output  of  a  2-D  time  series  model  driven  by  white  noise. 
Furthermore,  the  time  series  model  output  is  corrupted  by  additive 
white  noise.  To  be  precise,  such  a  representation  is  approximate 
in  the  case  of  radar  data.  Nevertheless,  in  practice  time  series 
models  have  been  shown  to  provide  a  good  fit  to  radar  data.  Thus, 
the  2-D  process  {x(n,k)}  is  assumed  to  be  represented  as 

(7-2)  x(n,k)  =  y(n,k)  +  w(n,k) 

where  y(n.k)  is  the  output  of  a  linear,  shift -invariant,  2-D  AR  time 
series  model,  and  w(n,k)  is  a  2-D,  zero-mean,  Gaussian-distributed, 
white  noise  process.  Processes  {y(n,k)}  and  {w(n,k)}  are  independent . 

Consider  now  the  2-D  AR  representation  for  the  process  {y(n,k)}. 
A  2-D  AR(Pd,P  s)  process  {y(n,k)}  is  defined  as 

Pd  Ps 

(7-3)  y(n,k)  =  -  ^  ^a*(id,is)y(n-id.k-is)  +  u(n,k) 

id=1  is=‘' 

(id.'s)^(0,0) 


where  the  input  {u(n,k)}  is  a  2-D,  zero-mean,  Gaussian-distributed, 
white  noise  process  with  variance  ,  {a(id,is)  Md  =  0,  1 ,  .  .  .  ,  Pdl  ig  =  0,  1 , .  .  .  , 
Psi  (id.y  ^  (0.0)}  are  complex- valued,  constant  coefficients  referred  to 
as  the  AR  parameters,  and  Pd,  Ps  a^re  the  model  order  parameters 
along  the  time  (Doppler)  and  space  dimensions,  respectively.  The 
transfer  function  associated  with  model  (7-3)  is  obtained  as  the 
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2-D  Z-transform  of  Equation  (7-3) .  Specifically,  the  transfer 
function  is  a  2-D,  complex- valued,  rational  function  of  the  form 


(7-4) 


T(Zd,Zs)  = 


ACZd.Zg) 


Pd  Ps 


£  Xa*(Pd-Ps)Zd^''Zs^^ 

Pd=0  Ps=0 


(7-5)  a(0,0)  =  1 

with  complex-valued  variables  Zd  and  Zg,  where  A(Z|j,Zg)  is  the  AR 
scalar  2-D  polynomial,  and  bo  is  a  real -valued  scalar  coefficient. 
As  indicated  in  Equation  (7-5),  the  leading  coefficient  of  A(Zd,Z3) 
is  unity.  This  follows  from  Equation  (7-3),  and  is  the  2-D 
version  of  a  monic  polynomial  in  1-D.  The  2-D  frequency  response 
of  model  (7-3)  ,  denoted  herein  as  T(fd.g.  is  obtained  by 
restricting  the  complex  variables  Zd  and  Zg  to  the  unit  surface  in 
the  complex  2-D  plane, 

(7-6)  Zd  =  ej^’^'i 

(7-7)  Zg  =  e*^"^® 

with  and  fg  the  normalized  temporal  (Doppler)  and  spatial 
frequencies,  respectively. 

In  the  context  of  2-D  FAME  processing  for  surveillance  phased 
array  radar  systems,  the  total  interference  (jamming  and  ground 
clutter)  is  represented  by  a  process  {y(n,k)}  of  the  form  (7-3)  ,  and 
the  receiver  noise  is  represented  by  a  white  noise  process  {w(n,k)} 
as  defined  above.  Thus,  it  is  required  to  identify  model  order 
and  coefficient  parameters,  (Pd.Ps)  {3('d>U}'  respectively,  given 

the  secondary  data.  Having  identified  the  model  parameters  which 
represent  the  channel  output  process,  the  associated  2-D  whitening 
(inverse)  filter  is  available  directly.  Specifically,  given  the 
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channel  output  2-D  sequence,  {x(n,k)},  the  whitening  filter  residual 
sequence,  {£(n,k)},  is  obtained  as 


(7-8)  e(n,k)=  ^a*(id,i3)x(n-id,k-i5) 

id=0  is=0 

with  a(0,0)  =  1.  Notice  that  the  whitening  filter  is  a  2-D  MA 
system,  with  {a(id.is)}  as  the  MA  coefficients  (an  MA  system  is  the 
system  inverse  of  an  AR  system,  and  vice-versa) . 

A  2-D  model  of  the  form  in  Equation  (7-3)  is  referred  to  as  a 
first-quadrant  system  since  only  values  in  the  first  quadrant  of 
the  output  2-D  plane  (except  for  initial  conditions)  are  used  to 
generate  the  system  output.  Model  (7-3)  is  causal,  in  loose 
analogy  with  the  1-D  case,  since  only  past  outputs  and  present 
inputs  are  used  to  generate  the  present  output.  Similarly,  its 
system  inverse  in  Equation  (7-8)  is  causal  also.  In  the  phased 
array  radar  space-time  problem  causality  along  the  time  axis  is  an 
inherent  feature  of  the  channel  output.  In  contrast,  the  issue  of 
causality  along  the  spatial  axis  appears  ambiguous  because  all 
channels  generate  an  output  at  each  temporal  sampling  instant. 
The  ambiguity  is  removed  by  considering  the  phase  reference  point 
to  be  at  array  element  k  =  0.  Model  (7-3)  is  also  recursively 
computable  (Dudgeon  and  Mersereau,  1984),  which  simplifies 
hardware  implementation.  Other  region  of  support  options,  such  as 
the  non- symmetric  half  plane  (NSHP) ,  are  of  interest  and  should  be 
considered  in  future  efforts. 

Model  (7-3)  is  causal  and  causally-invertible .  Thus,  if  the 
noise  {w(n,k)}  in  Equation  (7-2)  is  zero  and  model  (7-3)  represents 
the  channel  output  exactly,  then  {x(n,k)}  =  {y(n,k)}  (see  Equation  (7-2) ) 
and  the  residual  {e(n,k)}  of  {x(n,k)}  is  a  true  innovations  sequence. 
For  any  other  conditions,  the  filter  residual  {8(n,k)}  approximates 
an  innovations . 
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Two-dimensional  time  series  models  have  several  features 
distinct  from  their  1-D  counterparts.  The  three  distinctions  most 
relevant  to  the  problem  considered  herein  are  summarized  next. 
The  first  key  distinction  is  that  most  2-D  polynomials  are  not 
factorable.  Thus,  a  polynomial  A(Zd,Z3)  identified  by  an  algorithm 
is  likely  to  be  unf actorizable .  This  complicates  key  issues  such 
as  stability  determination.  In  2-D  systems,  poles  and  zeros  can 
occur  as  functions  rather  than  as  an  isolated  point  (Dudgeon  and 
Mersereau,  1984),  and  a  2-D  system  is  stable  if  the  loci  of  the 
pole  is  inside  the  unit  circle.  It  is  important  to  note  that  all 
cases  modeled  in  Phase  II  using  the  2DARLS  algorithm  resulted  in  a 
stable  2-D  model,  which  is  also  true  for  the  1-D  multichannel 
cases  considered.  The  second  major  distinction  is  that  2-D  models 
offer  more  dynamic  as  well  as  static  modeling  degrees-of -freedom. 
Recall  that  a  1-D  vector  system  has  isolated  (single-point)  poles 
and  zeros  only  for  the  temporal  dimension  (a  1-D  multichannel  AR 
system  does  have  system  zeros)  ,  whereas  the  2-D  AR  system  in 
Equation  (7-3)  has  poles  for  both  time  and  space,  and  those  poles 
are  generalized  into  loci  rather  than  isolated  points  (as  stated 
above)  .  The  space  dimension  in  a  1-D  vector  AR  system  is  non¬ 
dynamic  (has  no  poles)  and  with  fixed  value  J  (in  the  context 
considered  herein)  ,  whereas  in  a  2-D  scalar  AR  system  the  space 
dimension  is  dynamic  and  with  model  order  Pg  that  can  be  different 
from  J.  When  Ps<J,  the  dimensionality  of  the  LS  equations  to  be 
solved  is  smaller  for  2-D  scalar  AR  systems  than  for  1-D  vector  AR 
systems;  however,  the  increased  dimensionality  of  the  LS  equations 
in  a  1-D  multichannel  AR  system  (when  Pg<J)  does  not  result  in  an 
increase  in  dynamic  degrees-of -freedom  (number  of  poles) .  The 
third  distinction  of  relevance  is  that  causality  and  region-of- 
support  issues  offer  various  alternatives  for  2-D  systems,  in 
contrast  with  a  single  option  for  1-D  systems.  Region-of -support 
options  must  be  considered  keeping  in  mind  that  causality  is  to  be 
preserved  if  the  innovations  property  is  desired. 
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Estimation  of  the  parameters  for  a  2-D  scalar  time  series 
models  has  been  addressed  by  several  authors  (see  [Marple,  1987], 
[Dudgeon  and  Mersereau,  1984] ,  [Therrien,  1986]  and  the  references 
therein) .  For  AR  models  most  algorithms  require  generation  of  the 
ACS  of  the  process,  and  the  algorithms  are  extensions  of  the  1-D 
case.  In  contrast  with  the  1-D  case,  for  2-D  systems  the  maximum 
entropy  method  (MEM)  is  not  equivalent  to  the  AR  method.  Further, 
the  MEM  parameters  are  obtained  via  optimization  procedures,  with 
their  attendant  convergence  and  other  such  difficulties.  Most  2-D 
model  identification  algorithms  are  formulated  in  a  stochastic 
framework  and  require  the  ACS  (in  practice,  only  an  estimate  of 
the  ACS  is  available)  .  One  exception  is  the  2DARLS  method 
selected  in  Phase  II  and  summarized  next.  The  2DARLS  is  based  on 
minimizing  the  mean-squared  error  in  fitting  a  model  to  data.  As 
such,  it  can  be  viewed  as  a  statistical  approach.  Both  approaches 
(stochastic/probabilistic  and  statistical)  lead  to  identical 
results  in  the  case  of  Gaussian-distributed  data.  Thus,  the 
statistical  framework  is  more  general  because  it  can  be  applied  in 
cases  where  the  data  satisfies  a  distribution  other  than  Gaussian. 

In  the  presentation  below  the  2DARLS  identification  algorithm 
is  formulated  for  the  ideal  case  wherein  the  noise  process  w(n,k)  is 
zero.  In  other  words.  Equation  (7-2)  is  assumed  to  reduce  to  x(n,k) 
=  y(n,k),  so  the  formulation  can  be  defined  in  terms  of  the  channel 
output  process,  x(n,k).  However,  Equation  (7-2)  is  a  better 
representation  of  conditions  in  practical  scenarios.  Fortunately, 
the  2DARLS  algorithm  is  robust  to  the  presence  of  additive  white 
noise . 

Consider  the  problem  of  fitting  an  AR(Pd.  Pg)  model  to  the 
zero-mean,  stationary  (at  least  in  the  wide  sense) ,  2-D  finite- 

duration  sequence  {x(n,k)  I  n  =  0,  1 . N-l;  k  =  0,  1,  ...  ,  J-1}.  That  is,  it  is 

desired  to  obtain  a  set  of  constant,  complex- valued,  scalar 
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coefficients  {a(ic,is)  |  id  =  0,  1 ,  .  .  .  ,  Pdl  ig  =  0,  1 ,  .  .  .  ,  Pg;  (id, is)  (0,0)},  and  a 
scalar  variance  such  that  the  random  process  {x(n,k)}  (of  which 

the  given  sequence  is  a  particular  finite-duration  realization) 
satisfies  the  following  relation, 

Pd  Ps 

(7-9)  x(n,k)  =  -  £  £a*(id.is)x(n-id,k-is)  +  u(n,k) 

'd=''  's=1 

(id.is)^(O'O) 


where  {u(n,k)}  is  the  zero-mean,  complex- valued,  input  white  sequence 
with  variance  =  E[u(n,k)u*(n,k)].  In  the  2DARLS  algorithm  the 

unknown  parameters,  and  {a(id,is)  ]  id  =  0,  1,  .  .  .  ,  Pdi  ig  =  0,  1,  .  .  .  ,  Pg',  (id>is)  ^ 

(0.0)},  are  estimated  via  minimization  of  the  square  of  the  error  in 
fitting  model  (7-9)  for  a  fixed  model  order  pair  (Pd.Ps)  to  a  given 
set  of  secondary  data. 

The  2DARLS  method  formulation  is  as  follows.  Consider  first 
the  case  where  the  secondary  data  consists  of  only  one  element; 
that  is,  K=1.  Let  £(n,k)  denote  the  forward  linear  prediction  error 
variable  of  the  AR(Pd,P  g)  model , 

P  P 

(7-10)  e(n,k)  =  x(n,k)+ £  £a*(id,ig)x(n-it,,k-is)  Pd<n<N-1;  Pg<k<J-1 

'd~^ 

(id,is)5t(0,0) 


The  function  to  be  minimized  is  the  following  real-valued  scalar 
function  of  the  forward  linear  prediction  error  variable, 

N-1  J-1  N-1  J-1 

(7-11)  J(a)=  ^  ^e*(n,k)e(n,k)=  ^  ^e(n,k)e*(n,k) 

n=Pj  k=P5  n=Pjj  k^Pg 

where  a  is  the  following  ((Pd+1)(Ps+1)-1  ) -element  column  vector. 
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(7-12) 


a  =  [a(Pd,Ps)  a(Pd,Ps-1)  a(Pd,0)  ...  a(0,P3)  a(0,Ps-1)  a(0,1)f 

In  standard  optimization  theory,  function  J(a)  is  referred  to  as  the 
Jacobian.  The  forward  linear  prediction  squared  error,  denoted  as 
Zf,  is  defined  as  total  squared  error  in  the  linear  fit,  and  for 
the  special  case  of  scalar  sequences  it  is  obtained  as 


(7-13) 


N-1  J-1 

I,  =  ^  ^£*(n,k)e(n,k) 


N~1  J-1 

^  ^e(n,k)e*(n,k) 


It  follows  from  Equations  (7-11)  and  (7-13)  that  Ej  =  J(a) .  If  the 
AR(Pd.Ps)  model  is  an  appropriate  fit  to  the  data,  the  prediction 
squared  error  is  an  estimate  of  the  variance  of  the  input  noise. 
Equivalently,  the  prediction  squared  error  is  also  an  estimate  of 
the  variance  of  the  prediction  error;  specifically. 


(7-14) 


-2 

=^u 


(N-Pd)(J-Ps) 


In  order  to  solve  the  minimization  problem  postulated  above,  it  is 
convenient  to  define  the  following  {(Pd+'l)(Ps+^))  -element  column 
vector, 

(7-15)  x(n-Pd:n,k-Ps;k)  =  [x(n-Pd,k-Ps)  ...  x(n-Pd,k)  ...  x(n,k-Ps)  ...  x(n,k)]^ 

Given  these  Definitions  (7-12)  and  (7-15),  the  forward  linear 
prediction  error  in  Equation  (7-10)  can  be  expressed  as, 

(7-16)  e(n,k)  =  [a^  l]  xCnin-P^^.kik-Pg) 

and  the  function  to  be  minimized,  J(a),  is  expressed  as 
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(7-17)  J(a)  =  [a^  l] 


N-1  J-1 


^  ^x(n:n-P^,k:l<-P3)x>:n-P^,k:k-P3) 


n=Pd  k=P3 


Now  let  a  ((Pd+1)(Ps+1M(Pd+1)(Ps+1))  matrix  R  represent  the  double 
summation  of  outer -products  (dyads)  in  Equation  (7-17) ;  that  is, 


N-1  J-1 

(7-18)  R=  Y,  2x(n:n-P^,k:k-Pg)x^(n:n-Pjj,k:k-P5) 


Notice  that  matrix  R  is  Hermitian.  Now  further  define  the 
following  partitions  in  R, 


(7-19) 


R  = 


R 


11 

pH 

M2 


where  is  ((Pd+1)(Ps+1)-1)x((Pd+1)(Ps+1)-1).  Ii2  is  ((Pd+1)(Ps+1)-1)x1, 

and  ^22  is  a  scalar.  Now  the  function  to  be  minimized  is 
represented  simply  as 


(7-20)  J(a)  =  a^R-|ia  +  ry2i  +  §^ti2  ■*■*’22 


which  is  a  real-valued  scalar  function  of  the  complex- valued 
coefficient  vector,  a. 

Standard  optimization  theory  applied  to  function  i(a)  in 
Equation  (7-20)  leads  to  an  equation  of  the  form 

(7-21)  P-j-ja^  — r.i2 


The  set  of  linear  equations  (7-21)  can  be  solved  for  a  using  any 
one  of  various  techniques  (including  Cholesky  factorization  since 
matrix  R.|.|  is  Hermitian)  .  Such  a  solution  is  of  the  form 
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(7-22) 


-  -  *^11-12 

Once  the  optimum  coefficients  are  available,  the  forward  linear 
prediction  error  can  be  obtained  using  Equations  (7-13)  and  (7- 
20) ;  specifically, 

( 7-23 )  =  J°(a)  =  +  fgg  =  r22  - 112^^71^12 

where  J°(a)  denotes  the  optimum  value  of  the  Jacobian.  Thus,  the 
standard  approach  to  the  ARLS  method  is  to  implement  Equations  (7- 
22)  and  (7-23)  . 


A  preferred  alternative  (from  accuracy  and  computational 
considerations)  to  the  approach  outlined  above  is  presented  in 
Appendix  A  for  the  1-D  multichannel  ARLS.  With  straightforward 
modifications,  the  approach  in  Appendix  A  applies  also  to  the 
2DARLS,  and  is  summarized  next.  The  first  step  is  to  combine 
Equations  (7-21)  and  (7-23)  into  a  single  matrix-vector  equation. 


a’ 

[Rii 

1 - 

cvi 

a 

■q" 

1 

H 

L!-12 

- 1 

C\J 

CM 

1 

From  the  structure  of  matrix  R  as  presented  in  Equation  (7-18) ,  it 
follows  that  matrix  R  admits  a  factorization  of  the  form 

(7-25)  R  =  X'^X 

where  X  is  the  data  matrix  of  the  normal  equations  in  least- 
squares  linear  prediction.  Appendix  A  states  the  conditions  that 
must  be  satisfied  for  matrix  X  to  have  full  rank.  Assuming  those 
conditions  are  satisfied,  then  matrix  X  admits  a  QR  decomposition 
of  the  form 
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{1-26) 


X  =  QU 


where  Q  is  a  unitary  matrix,  and  U  is  a  complex- valued,  full-rank, 
upper-triangular  matrix.  It  follows  from  Equations  (7-25)  and  (7- 
26)  that 

(7-27)  R  =  X^X  =  U^U 

Matrix  U  admits  a  partitioning  analogous  to  that  in  Equation  (7- 
19) ;  that  is, 

Uii  yi2 

[0]  U22 

[0]  [0] 

where  is  a  full-rank  upper- triangular  matrix,  U22  is  a  non-zero 

scalar,  and  ^^2  i®  ®  column  vector  with  no  particular  properties. 
It  follows  from  Equations  (7-24) ,  (7-27) ,  and  (7-28)  that 

uI'iU,,  uC,y,2 

^12^11  ^12^12 ’*'^22*^22,  -t 

Equation  (7-29)  separates  into  the  following  two  equations, 

(7-30)  = 

(7-31)  Ef  =  y|^2Uil§  +  yi2yi2  ■'■’-’22^22 

H 

Elimination  of  U11  from  both  sides  of  Equation  (7-30)  leads  to 
(7-32)  U.,.|a  =  -U.,2 


(7-29) 
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and  then  a  can  be  obtained  as 

(7-33)  a  = 

Substitution  of  Equation  (7-33)  into  Equation  (7-31)  results  in 
(7-34)  Zj=U22U22 

Equations  (7-33)  and  (7-34)  constitute  the  desired  solution.  A 
preferred  alternative  to  the  inverse  calculation  in  Equation  (7- 
33)  is  to  solve  Equation  (7-32)  using  back- substitution  since 
matrix  is  upper- triangular .  Back-substitution  is  both  accurate 

and  efficient.  Thus,  the  major  computational  load  involved  in 
solving  for  a  and  Zj  is  the  calculation  of  the  QR  decomposition. 

For  the  case  where  the  secondary  data  set  has  more  than  one 
element ,  K>  1,  the  four  processing  options  identified  in  Appendix 
A  apply  directly.  However,  analyses  carried  out  to  date  indicate 
that  the  block  approaches  generate  better  parameter  estimates. 

7.2  Two-Dimensional  PAMF  Detection  for  STAP  Applications 

The  2-D  PAMF  is  an  extension  of  the  1-D  multichannel  PAMF. 
In  fact,  the  structure  is  analogous  to  that  described  by 
Rangaswamy  and  Michels  (1997)  and  by  Roman  et  al .  (1998),  as 

suggested  by  inspection  of  the  block  diagram  in  Figure  7-1. 
However,  the  entries  of  the  blocks  in  Figure  7-1  are  different 
from  those  corresponding  to  the  PAMF.  Most  significantly,  in  the 
2-D  PAMF  the  parameters  are  estimated  using  the  2DARLS  algorithm, 
and  the  whitening  filter  is  a  2-D  scalar  filter  which  carries  out 
spatial  and  temporal  whitening  jointly.  In  the  PAMF  the  whitening 
is  carried  out  in  two  steps:  first  temporal  and  then  spatial. 
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with  respect  to  Figure  7-1,  the  unit-variance  2-D  sequence 
{v(n,k)}  is  obtained  from  the  whitened  sequence  {e(n,k)}  as 

(7-35)  v(n,k)  =  e(n,k)  0  <  n  <  N-P^-1;  0  <  k  <  J-P -1 

and  the  notation  v(n,k)  <->  v  indicates  conversion  from  2-D  sequence 
into  column  vector,  following  any  convention  for  the  mapping  of 
the  sequence  elements  into  a  vector.  Finally,  the  test  statistic 
is  calculated  as 


(7-36) 


sHy 

2 

S^S 


Notice  that  this  expression  is  analogous  to  the  vector  form  of 
Equation  (2-17)  ,  but  the  vector  variables  have  very  distinct 
meaning . 


PRIMARY 


Figure  7-1.  Two-dimensional  PAMF  STAP  and  detection  architecture. 
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8.0  SUMMARY  AND  FUTURE  WORK 


The  generic  STAP  and  detection  architecture  introduced  herein 
covers  the  MF  as  well  as  three  new  STAP  algorithms,  orthogonal 
projection  (OP) ,  maximum  correlation  (MC) ,  and  OP  using  canonical 
variables  (OPCV) .  These  algorithms  have  potential  for  significant 
dimensionality  reduction,  while  being  based  on  the  optimization  of 
probabilistic  criteria. 

Future  work  includes  further  development  of  the  theoretical 
relationships  between  the  STAP  methods  herein  and  others,  such  as 
the  multistage  Wiener  filter.  Another  aspect  of  future  work  is 
the  software-based  analysis  of  detection  performance  of  the  three 
methods  for  a  variety  of  configurations  under  known -covariance  as 
well  as  unknown- covariance  conditions.  For  each  STAP  algorithm 
several  distinct  test  statistics  can  be  defined,  and  each  test 
statistic  exhibits  different  performance  properties,  including 
CFAR.  The  application  of  alternative  test  statistics  should  be 
investigated.  In  the  context  of  the  adaptive  (unknown-covariance) 
formulation  of  the  algorithms,  the  utilization  of  structured 
covariance  estimates  instead  of  sample  covariance  estimates  should 
be  investigated  also. 

The  2-D  PAMF  constitutes  a  significant  extension  to  the  PAMF 
that  may  yield  computational  and  or  performance  advantages  in 
relation  to  the  PAMF-LS  (which  is  the  leading  candidate  among  the 
alternative  PAMF  implementations) .  Computational  advantages  are 
possible  specially  for  cases  wherein  the  number  of  channels  is 
large  because  the  PAMF-LS  channel  requires  identification  of  P  JxJ 
matrices,  whereas  the  2DARLS  requires  identification  of  PgXPg 
matrices  with  P^^  =  P  and  Pg  <  J .  Performance  advantages  may  arise 
due  to  the  enhanced  modeling  capability  of  the  2DARLS  along  the 
spatial  axis. 
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Future  work  for  the  2-D  PAMF  includes  software -based  analysis 
of  detection  performance,  specially  in  relation  to  other  methods, 
including  the  1-D  multichannel  PAMF-LS.  Such  analyses  should 
include  also  the  application  of  alternative  test  statistics. 
Recent  work  by  SSC  personnel  in  detection  of  anti-tank  mines  using 
ground  penetrating  radar  demonstrates  the  potential  for  wide 
applicability  of  the  2-D  PAMF  and  other  related  methods  to  images. 
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APPENDIX  A. 


AUTO-REGRESSIVE  LEAST- SQUARES  MODEL  IDENTIFICATION 


This  appendix  presents  the  least-squares  (LS)  identification 
algorithm  for  multichannel  auto-regressive  (AR)  processes.  Also 
presented  are  the  options  adopted  for  its  software  implementation 
in  the  context  of  the  parametric  adaptive  matched  filter  (PAMF) 
for  space-time  adaptive  processing  (STAP)  in  airborne  surveillance 
phased  array  radar  systems.  In  addition,  simulation  analysis 
results  are  presented  for  the  algorithmic  options  discussed 
herein.  The  presentation  herein  serves  as  documentation  for  the 
SSC-generated  MATLAB  function  arls.  The  algorithmic  discussions  are 
brief  since  details  are  available  in  texts  such  the  one  by  Marple 
(1987).  However,  software  implementation  aspects  unavailable 
elsewhere  are  discussed  herein. 

A. 1  Multichannel  Least- Squares  Formulation 

Consider  the  problem  of  fitting  an  AR  model  of  order  P  to  a 
zero-mean,  stationary  (at  least  in  the  wide  sense) ,  finite- 

duration  sequence  {x(n)  I  n  =  0,  1 . N-1},  with  x(n)  e  C^.  That  is,  it 

is  desired  to  obtain  a  set  of  constant  matrix  coefficients  {A(k)  I  k  = 
1,2,  ...  ,  P},  with  A(k)  e  and  a  covariance  matrix  Ryu  e  such 

that  the  random  process  {x(n)}  (of  which  the  given  sequence  is  a 
particular  finite-duration  realization)  satisfies  the  following 
equation, 

P 

(A-1)  x(n)  =  -  A*^(k) x(n  -  k)  +  u(n) 

k=1 


where  {u(n)}  is  the  zero-mean  input  white  sequence,  with  u(n)  e  and 
Ruu  =  E[u(n)  U«(n)] .  If  the  process  {x(n)}  satisfies  Equation  (A-1)  , 
then  it  is  said  to  be  an  AR  process  of  order  P,  and  is  denoted  as 

AR(P) . 
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The  unknown  parameters,  R^j  {A(k)},  are  estimated  via 

minimization  of  the  error  squared.  That  is,  let  e(n)  denote  the 
forward  linear  prediction  error  of  the  AR(P)  model , 

P 

(A-2)  e(n)  =  x(n)+^AV)x(n-k)  n  =  P,  P+1, ,  N-1 

k=1 


Then,  the  function  (Jacobian)  to  be  minimized  is  of  the  form 


N-1 

(A-3)  J  =  tr[5:f]  =  ^e^^(n)e(n) 

n=P 


where  the  forward  linear  prediction  error  matrix,  Xf,  is  defined  as 
N-1 

(A-4)  2f=21e(n)e» 

n=P 


If  the  AR(P)  system  is  an  appropriate  fit  to  the  data,  the 
prediction  error  matrix  is  an  estimate  of  the  covariance  matrix  of 
the  input  noise;  specifically, 


(A-5) 


1 


N-P 


Marple  (1987)  shows  that  minimization  of  the  function  J  leads  to 
the  following  block  matrix  linear  equation. 


Rx(o,o)  • 

•  Rx(0,P-1)  1  Rx(0,P)  ■ 

■A(P)- 

■[Ol¬ 

Rx(P-i,o)  ■ 

..  r^(p_i,p_i)!r^(P_1,P) 

A(1) 

io] 

Rx(P,0)  . 

Rx{P.P-1)  !  Rx(P,P)  J 
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where  Ij  e  is  an  identity  matrix,  and  Rx(i,j)  6  is  of  the  form 

N-1-P 

(A-7)  Rx(U)=  ^x(n  +  i)x^(n  +  j)  (i,j)  =  (0,0), . . . ,  (P,P) 

n=0 


The  purpose  of  the  partitions  introduced  explicitly  in  Equation 
(A“6)  is  made  clear  in  Section  A. 2.  In  compact  notation,  Equation 
(A- 6)  can  be  expressed  as 


A 

Ri2 

A 

r[o]i 

UjJ 

-^21 

R22. 

[ijJ 

with  Re  C'^(P+i)xJ(P+i)^  g  (^JPxJP^  (j^jpxj^  g  ^JxJP^  p^^g  <£jxj^ 

and 


(A-9) 


A(P) 

A(P-1) 

A(1) 


Equations  (A-6)  and  (A-8)  are  of  the  same  structure  as  the 
corresponding  multichannel  Yule-Walker  normal  equation,  except 
that  matrix  R  is  Hermitian  only  (whereas  the  corresponding  matrix 
in  the  Yule-Walker  formulation  is  both  Hermitian  and  block 
Toeplitz) .  The  diagonal  sub-matrices,  R^^  and  R22  ,  are  Hermitian 

also.  Furthermore,  for  appropriate  values  of  N  and  P,  matrices  R, 
R.|.|,  and  R22  are  positive  definite. 

The  formulation  presented  above  is  referred  to  as  the 
covariance  (or  non-windowed)  method  of  leas t - squares  linear 
prediction.  This  terminology  arises  from  its  early  usage  in 
speech  processing  applications,  and  is  misleading  in  the  context 
of  modern  stochastic  algorithms.  Specifically,  R  is  neither  a 
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structured  covariance  matrix,  nor  a  sample  (maximum  likelihood 
estimate)  covariance  matrix  (except  for  the  case  where  P  =  N-1 ,  as 
noted  in  Section  A.3) .  It  is  important  to  note  that  the  ordering 
of  the  AR  coefficients  in  block  matrix  A  and  the  partitioning  and 
definitions  in  Equations  (A-6)-(A-9)  differ  from  the  usual 
conventions  (Marple,  1987)  .  This  variation  is  adopted  herein 
based  only  on  considerations  of  the  software  implementation  of  the 
method,  and  there  is  strict  equivalence  between  the  results 
obtained  using  either  notation. 

The  structure  of  matrix  R  as  presented  in  Equations  (A- 6)  and 
(A-7)  allows  a  factorization  of  the  form 


the  data  matrix  of  the  normal  equations  in 
of  least-squares  linear  prediction,  and  is 

x^(1)  •••  x^(P) 

x^(2)  •••  x^(P  +  1) 

x^^(P  +  1)  x^^(2P) 

x^^(N-l-P)  x^^(N-P)  •••  x^^(N-l) 


(A-IO) 


R  =  X^X 


where  X  e  is 
the  covariance  method 
defined  as 


(A-11) 


X  = 


x^(0) 

xH{1) 

x^(P) 


The  form  in  Equation  (A-11)  corresponds  to  the  case  where  N»P. 

A. 2  Normal  Equations  Solutions 

The  normal  equations  (A-6)  (or  equivalently,  (A-8))  can  be 
solved  using  a  variety  of  techniques,  including  the  singular  value 
decomposition  (SVD) .  However,  two  specific  techniques  have 
features  that  are  important  in  the  context  of  interest  herein. 
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One  technique  operates  directly  on  the  data  matrix  X,  whereas  the 
other  exploits  the  Hermitian  property  of  matrix  R.  Each  technique 
generates  one  set  of  AR(P)  coefficients  given  one  finite-duration 
realization  of  the  array  output  process.  Both  techniques  require 
matrix  R  to  be  positive  definite  in  order  to  generate  a  complete 
solution  (both  A  and  Lf)  .  Conditions  are  established  which  insure 
that  R  is  positive  definite.  In  addition,  the  positive  semi- 
definite  case  is  considered  also. 

A. 2.1  DATA-BASED  SOLUTION 

Consider  first  the  case  where  the  number  of  rows  in  the  data 
matrix  X  is  equal  to  or  larger  than  the  number  of  columns;  that 
is,  when  the  following  condition  is  satisfied: 

(A-12)  N  -  P  >  J(P  +  1)  =  JP  +  J 

Condition  (A-12)  insures  that  rank[X]  =  J(P  +  1)  with  probability  one 
due  to  the  randomness  of  the  data  in  the  context  of  airborne 
surveillance  radar  applications.  With  Condition  (A-12)  satisfied, 
the  QR  decomposition  of  X  is  of  the  form 

(A-13)  X  =  QU 

where  Q  €  is  a  unitary  matrix,  and  U  e  is  an 

upper- triangular  matrix  with  rank[U]  =  J(P  +  1 ) .  It  follows  from 
Equations  (A- 10)  and  (A-13)  that 

(A- 14)  R=X'^X  =  U^U 

Matrix  U  admits  a  partitioning  analogous  to  that  in  Equation  (A- 
8) ;  that  is. 
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U11 

{A- 15)  U=  [0]  U22 

JO]  [0]. 

where  6  ^JPxJP  upper- triangular  with  rank[U^^]  =  JP ,  U22  e  is 

upper- triangular  with  rank[U22]  =  J,  and  U-|2  e  with  no  particular 

properties.  It  follows  from  Equations  (A-8) ,  (A-14),  and  (A-15) 

that 


(A  -le: 


u'i'M,  u”u,2  ir*i^r[®i 

^12^11  ^12^12  "^^22^22.  -'j. 


Equation  (A-16)  separates  into  the  following  two  equations. 


(A- 17)  u';'iU^iA  =  -u';^^Ui2 


( A- 18 )  If  =  U';^2Ui  iA  +  U',^2Ui2  +  U22U22 


Pre-multiplication  of  both  sides  of  Equation  (A-17)  by  U.j.|  leads 


(A-19)  UiiA  =  -Uf 


and  then  A  can  be  obtained  as 


(A-20)  A  =  -Ui.U. 


11^^12 


Substitution  of  Equation  (A-20)  into  Equation  (A-18)  then  results 


(A-21)  If=U22U2 


Equations  (A-20)  and  (A-21)  constitute  the  desired  solution. 
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Equation  (A-20)  expresses  the  solution  for  A  in  direct  form, 
which  is  convenient  for  obtaining  the  solution  for  Zj.  However,  an 

alternative  approach  to  a  matrix  inverse  calculation  followed  by  a 
matrix-matrix  multiplication  is  to  solve  Equation  (A-19)  via  back- 
substitution  because  U-|i  is  upper- triangular .  Since  U-|-t  is 
positive  definite,  this  alternative  is  the  most  efficient 
computationally  (Golub  and  Van  Loan,  1989) .  The  increase  in 
computational  efficiency  over  the  direct  inverse  approach  is 
proportional  to  the  number  of  columns  of  U^2- 


As  stated  above.  Condition  {A-12)  insures  that  the  data 
matrix  X  has  rank  equal  to  its  smaller  dimension  (column  rank) , 
and  thus  and  U22  are  both  positive  definite.  This  condition 

leads  to  an  upper  bound  for  model  order  (P) ,  given  N  and  J.  The 
upper  bound  condition  is  expressed  as 


(A-22) 


P  <  P _ =  fix 


max 


N-J  ~ 
J  +  1 


BOUND  FOR  BOTH  A  AND  Zf 


where  the  operator  fixM  truncates  a  real -valued  number.  This  bound 
is  very  tight,  and  implies  that  N  >  2J  +  1  so  that  >  1  . 

A  more  relaxed  bound  is  possible  if  only  the  AR  coefficients 
are  desired,  or  if  the  values  of  the  parameters  J,  N,  and  P  result 
in  a  data  matrix  X  with  fewer  rows  than  columns  (such  is  indeed 
the  case  when  N  ~  J,  even  if  P  is  small)  .  In  such  case  the 
condition  that  must  be  satisfied  is 


(A-23)  JP  +  J>N-P>JP 

where  the  second  inequality  follows  from  N  -  P  >  rank[R.|.(]  =  JP. 
Condition  (A-23)  insures  that  rank[X]  =  N  -  P  >  JP  with  probability  one 
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due  to  the  randomness  of  the  data.  Assuming  Condition  (A-23)  is 
satisfied,  the  form  of  matrix  U  in  the  QR  decomposition  of  X  is 


(A-24) 


U11  U^2 

[0]  U22 


where  e  is  upper-triangular  with  rank[U^J  =  JP,  and  g 

with  no  particular  properties,  as  before.  However,  now  U22  € 
^(N-P-JP)xJ  rank[U22]  =  N  -  P  -  JP  <J.  The  structure  of  U22  is  still 

upper- triangular ,  but  the  number  of  rows  is  less  than  the  number 
of  columns.  In  fact,  when  Condition  (A-23)  is  met  with  equality 
on  the  right  (that  is,  N  -  P  =  JP)  ,  matrix  U22  is  non-existent. 

Nevertheless,  the  AR  coefficients  are  obtained  by  solving  Equation 
(A-19) ,  as  before. 


The  second  inequality  (N-P>JP)  in  Condition  (A-23)  leads  to 
another  upper  bound  for  model  order,  given  N  and  J.  In  this  case, 
the  upper  bound  condition  is  expressed  as 


(A-25) 


P<P 


max 


=  fix 


N 

J  +  1 


BOUND  FOR  A  ONLY 


This  bound  is  less  tight  than  Bound  (A-22)  ,  and  implies  that  N>J  + 
1  so  that  P^3^  >  1  . 

A. 2. 2  COVARIANCE -BASED  SOLUTION 

As  in  Section  A. 2.1,  consider  first  the  case  where  Condition 
(A-12)  is  satisfied.  Since  X  has  full  rank  in  this  case,  it 
follows  that  R  is  positive  definite  (rank[R]  =  JP  +  J)  with 
probability  one  due  to  the  randomness  of  the  data  (using  the  SVD 
it  is  easy  to  show  that  rank[R]  =  rank[X])  .  Furthermore,  and  R22 

are  Hermitian  and  positive  definite  also  (rank[R.|.| ]  =  JP;  rank[R22]  =  J)  • 
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Based  on  the  partitions  defined  in  Equations  (A-6)  and  (A-8), 
the  normal  equations  separate  into  two  equations, 

(A-26)  R^^A  =  -R^2 

(A-27)  If=R22  +  R21^ 

As  for  the  data-based  solution.  Equation  (A-26)  is  utilized  to 
solve  for  A,  and  then  Zj  is  obtained  via  Equation  (A-27)  .  Equation 

(A-26)  can  be  solved  via  a  matrix  inverse  calculation  followed  by 
a  matrix-matrix  multiplication.  However,  since  R.|.|  is  Hermitian 

and  positive  definite,  a  more  efficient  approach  (based  on  the 
Cholesky  decomposition)  is  possible. 

The  Cholesky  decomposition  of  a  Hermitian,  positive  definite 
matrix  R.,.j  is  of  the  form 

(A-28)  R.,1=C^C 

Ipy  jp 

where  Ce  C  is  upper- triangular  with  real-  and  positive-valued 
elements  along  the  main  diagonal,  and  with  rank[C]  =  JP.  It  follows 
from  Equations  (A-26)  and  (A-28)  that 

(A-29)  C'^CA  =  -  R^2 

Given  the  triangular  structure  of  C,  matrix  A  is  solved  for  using 
back-substitution  twice.  Equations  (A-27 )- (A-29 )  constitute  the 
desired  solution. 

The  model  order  upper  bound  in  Equation  (A-22)  is  valid  also 
for  the  covariance  approach  since  rank  Condition  (A-12)  applies. 
Similarly,  the  bound  in  Equation  (A-25)  is  valid  also  for  the 
covariance  approach  when  Condition  (A-23)  holds  and  only  the  AR 
coefficients  are  required  ( rank[R]  =  rank[X]  holds  for  all  cases) .  When 
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Condition  (A-23)  is  satisfied  as  JP  +  J >  N  -  P > JP  (that  is,  without 
equality  on  the  right),  Equation  (A-27)  generates  a  non-zero  value 
for  L(,  but  such  solution  is  incorrect  because  it  is  generated  with 
a  rank  deficiency  in  the  covariance  matrix.  Furthermore,  when 
Condition  (A-23)  is  met  with  equality  on  the  right  (that  is,  N-P  = 
JP)  ,  the  matrix  generated  via  Equation  (A-27)  is  the  numerical 
null  matrix  because  the  rank  deficiency  is  maximum  (within  the 
limits  of  Condition  (A-23)).  If  Condition  (A-23)  is  violated  from 
the  right  (that  is,  if  N  -  P  <  JP)  ,  then  the  rank  defiency  is 
sufficiently  large  to  introduce  error  in  the  calculation  of  A 
also. 


A. 3  Surveillance  Scenario  Software  Implementation  Options 

As  in  Section  1.1,  consider  an  airborne  side-looking 
configuration  of  a  J-element  linear  phased  array  radar  for  moving 
target  detection.  In  this  context,  the  data  to  be  processed  is 
the  return  from  N  pulses  as  collected  by  the  J  array  elements  for 
each  one  of  Kj  range  bins.  This  JxNxKj  set  of  data  is  the  so- 
called  data  cube.  A  detection  decision  is  desired  for  each  range 
bin  in  the  data  cube.  Further,  the  decision  approach  is  adaptive 
in  order  to  account  for  possible  lack  of  statistical  stationarity 
in  the  data  cube.  The  favored  approach  is  sequential  in  nature; 
that  is,  each  range  bin  is  tested  in  a  pre-determined  order.  At 
each  decision  instant,  the  range  bin  to  be  tested  is  referred  to 
as  the  primary  data.  Also,  a  subset  of  K  <  K-p  range  bins  in  a 
neighborhood  near  the  primary  data  is  utilized  as  training  data 
for  the  processing  algorithm  and  the  detection  rule.  This  data 
set  is  referred  to  as  the  secondary  data.  The  range  bins  in  the 
secondary  data  set  are  assumed  to  provide  target-free,  independent 
realizations  of  the  array  output  process. 

The  PAMF  processing  and  detection  rule  utilizes  the  secondary 
data  to  generate  a  whitening  filter  for  the  primary  data,  and  then 
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applies  a  detection  rule  based  on  the  matched  filter  concept.  One 
possible  configuration  utilizes  the  least-squares  (LS)  algorithm 
to  identify  a  set  of  AR(P)  parameters  that  fit  the  secondary  data, 
and  then  utilizes  the  parameters  as  a  moving-average  filter  to 
process  the  primary  data.  The  filter  output  constitutes  a 
residual  sequence  to  which  the  detection  rule  is  applied.  Such 
configuration  is  referred  to  as  the  PAMF-LS. 

One  significant  aspect  of  the  surveillance  radar  context  is 
the  availability  of  more  than  one  realization  of  the  data  to  be 
utilized  for  the  model  identification  step  of  the  PAMF-LS.  Within 
such  context,  the  two  solution  approaches  in  Section  A. 2  lead  to 
three  distinct  software  implementation  options  of  interest.  These 
options  are  summarized  next,  and  a  descriptive  name  is  provided 
for  each.  The  given  names  link  each  algorithm  with  its  software 
implementation  in  a  MATLAB-based  function  generated  by  SSC  and 
named  arls. 

1.  COEFFQR:  QR  decomposition  of  data  matrices  and  model 

parameter  averaging. 

a)  Form  K  data  matrices,  one  for  each  secondary  data 
range  bin. 

b)  Generate  K  sets  of  model  parameters  (AR 
coefficients  and  prediction  error  matrices)  via  the 
data  matrix  QR  decomposition  approach. 

c)  Average  the  K  sets  of  model  parameters  to  obtain 
the  desired  solution. 

2.  COEFFCHOLESKY :  Cholesky  factorization  of  covariance 

matrices  and  model  parameter  averaging. 

a)  Generate  K  covariance  matrices,  one  for  each 
secondary  data  range  bin. 

b)  Generate  K  sets  of  model  parameters  via  the 
covariance  matrix  Cholesky  factorization  approach. 
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c)  Average  the  K  sets  of  model  parameters  to  obtain 
the  desired  solution. 

3.  DATABLKQR:  QR  decomposition  of  block  data  matrix. 

a)  Form  a  block  data  matrix  with  block  rows  consisting 
of  the  K  data  matrices  (one  data  matrix  for  each 
secondary  data  range  bin) . 

b)  Generate  the  desired  model  parameter  solution  via 
the  QR  decomposition  approach  applied  to  the  block 
data  matrix. 

4.  COVARCHOLESKY :  Covariance  averaging  and  Cholesky 

factorization  of  averaged  covariance. 

a)  Generate  K  covariance  matrices,  one  for  each 
secondary  data  range  bin. 

b)  Average  the  K  covariance  matrices  to  obtain  a 
single  covariance  matrix. 

c)  Generate  the  desired  model  parameter  solution  via 
the  covariance  matrix  Cholesky  factorization 
approach . 

Options  1  and  2  utilize  the  LS  solution  approaches  in  a  single¬ 
realization  context,  so  the  algorithmic  approaches  are  applied  as 
discussed  in  Section  A. 2.1.  The  rank  conditions  and  model  order 
upper  bounds  in  Section  A. 2.1  apply  also.  Options  3  and  4, 
however,  lead  to  a  different  rank  condition  and  model  order  upper 
bound . 

Consider  first  Option  4.  For  this  option,  the  covariance 
matrix  R  for  Equation  (A- 8)  is  generated  as  the  average  of  K 
single-range-bin  covariance  matrices;  namely. 
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(A-30) 


R  = 


where  X|^  denotes  the  data  matrix  for  the  kth  range  bin,  and  is 
the  covariance  for  the  kth  range  bin.  Covariance  matrix  R  as 
defined  in  (A-30)  is  a  valid  covariance  matrix  for  the  normal 
equations  (A- 8)  because  a  set  of  normal  equations  is  valid  for 
each  of  the  individual  single-range-bin  covariance  matrices.  As 
stated  previously,  the  covariance-based  solution  to  the  normal 
equations  (involving  both  A  and  Zf)  exists  and  is  unique  when  R 
has  full  rank,  rank[R]  =  JP  +  J .  The  conditions  under  which  R  attains 
full  rank  are  established  next. 


Under  Option  4,  the  conditions  under  which  R  attains  full 
rank  involve  the  system  parameters  J,  N,  and  P,  as  before,  but  K 
now  plays  a  key  role.  Consider  first  the  cases  where  N -  P  >  JP  +  J . 
For  these  cases  each  X|^  has  full  column  rank,  rank[Xk]  =  JP  +  J,  and 
consequently,  each  R^  has  full  rank,  rank[Rk]  =  JP  +  J  (see  Section 
A.  2. 2)  .  Matrix  R  has  full  rank  also  since  it  is  an  average  of  K 
independent,  full -rank  realizations  of  the  covariance  matrix. 

Consider  now  the  cases  where  N-P<JP  +  J.  For  these  cases 
each  Xk  has  full  row  rank,  rank[Xk]  =  N  -  P ,  and  consequently,  each  Rk 
is  rank-deficient,  rank[Rk]  =  N  -  P  (see  Section  A. 2. 2).  However,  the 
rank  of  R  increases  by  N  -  P  for  every  Rk  added  to  the  sum,  up  to 
the  maximum  possible  rank  value  of  JP  +  J .  In  other  words,  rank[R]  = 
min[(N  -  P)K,  (P  +  1)J].  Thus,  the  condition  for  R  to  have  full  rank 
under  Option  4  is 

(A-31)  (N-P)K>(P  +  1)J 

An  associated  model  order  upper  bound,  as  in  Equations  (A-22)  and 
(A-25) ,  is  meaningless  for  this  option.  In  fact.  Condition  (A-31) 
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allows  the  AR  model  order  to  be  as  large  as  P  =  Pmax  “  ^  ‘  "I  '  provided 
K  is  correspondingly  large  (K>JN  if  P  =  N-1).  This  set  of  values 
for  the  system  parameters  is  unlikely  to  be  adopted  in  the  context 
of  the  PAMF-LS  because  experience  shows  that  low  model  orders 
suffice  to  provide  excellent  detection  performance  in  most  cases. 
However,  it  is  appropriate  to  note  that  for  P  =  N-1  and  K>JN, 
matrix  R  generated  as  in  Equation  (A-30)  is  the  so-called  sample 
covariance  matrix,  and  is  the  maximum  likelihood  estimate  of  the 
covariance  matrix  of  vector  X  =  |x*^(0)  X^(1)  X*^(N-1)j  . 


Consider  now  Option  3,  the  QR  decomposition  of  the  block  data 
matrix.  Define  the  block  data  matrix  Xe  (j[;K(N-P)xJ(P+l) 


(A-32) 


Xi 


Xk 


It  is  simple  to  verify  that  this  block  data  matrix  is  related  to 
the  averaged  covariance  matrix  in  Equation  (A-30)  as 


{A-33) 


R  =  X^X 


The  leftmost  equality  is  of  the  same  form  as  Equation  (A-10) ,  so 
the  approach  defined  in  Section  A. 2.1  can  be  applied  directly. 
Specifically,  the  QR  decomposition  of  the  block  data  matrix  X  is 
generated  as  in  Equation  (A-13) ,  and  the  unitary  matrix  in  the 
decomposition  is  eliminated  from  the  formulation  as  in  Equation 
(A- 14) .  The  solution  for  the  model  parameters  is  obtained  using 
only  the  upper- triangular  matrix  factor  in  the  decomposition,  as 
in  Equations  {A-20)  and  {A-21) .  The  condition  for  X  to  have  full 
rank  is  identical  to  the  condition  for  R  to  have  full  rank; 
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namely,  Condition  (A-31) .  This  follows  from  the  leftmost  equality 
in  Equation  (A-33) .  The  various  comments  regarding  the  lack  of  AR 
model  order  bound  for  Option  4  apply  without  modification  to 
Option  3  also. 

From  a  structural  standpoint,  Options  3  and  4  can  be  used 
over  the  widest  set  of  conditions,  and  constitute  the  only 
alternatives  in  cases  where  J~N  (in  such  cases  Condition  (A-12) 
fails) .  The  relaxation  offered  by  Condition  (A-23)  for  Options  1 
and  2  is  important  in  the  application  context  considered  herein 
because  the  model  estimate  of  the  residual  covariance  matrix  is 
neglected  in  the  preferred  configuration  of  the  PAMF-LS  (the 
maximum  likelihood  estimate  or  the  time-averaged  estimate  are  used 
instead) .  In  terms  of  accuracy,  the  four  options  generate 
practically  identical  results  when  Condition  (A-12)  is  satisfied 
strongly;  namely,  N-P»JP  +  J.  Results  from  all  four  options  are 
very  similar  also  when  Condition  (A-12)  is  satisfied  with 
equality;  namely,  N  -  P  =  JP  +  J  .  From  a  computational  viewpoint, 
there  appears  to  be  a  set  of  system  parameter  values  (J,  N,  P,  K) 
wherein  each  algorithmic  option  executes  more  efficiently  than  the 
other  three.  A  detailed  operational  count  for  each  option  is 
required  in  order  to  address  this  issue  appropriately. 

The  model  order  upper  bound  for  the  LS  algorithm  presented  in 
[Roman  et  al . ,  2000]  is  based  on  Condition  (A-31),  and  is  correct 
from  an  algebraic  point  of  view.  However,  the  solution  approach 
adopted  in  the  paper  is  the  block  method  of  Option  3  (DATABLKQR) , 
and  for  both  block  methods  it  is  most  appropriate  to  view  the 
issue  of  model  order  as  a  condition  to  be  satisfied  rather  than  as 
a  bound,  for  the  reasons  stated  in  the  paragraph  following 
Condition  (A-31),  and  summarized  next.  Namely,  when  K>1  and  a 
block  method  is  used,  even  a  small  number  of  secondary  data  sets 
allows  a  solution  for  typical  scenario  parameters  (J  and  N)  and 
practical  values  of  model  order. 
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A. 4  Simulation  Analysis  Results 


Simulation-based  analyses  to  compare  the  LS  algorithmic 
approaches  have  been  carried  out  for  a  variety  of  system  parameter 
values  and  scenario  conditions,  and  a  subset  of  those  results  is 
presented  herein.  Of  specific  interest  are  results  obtained  for 
the  cases  considered  in  [Roman  et  al . ,  2000],  and  the  inclusion  of 
ground  clutter  temporal  correlation  model  type  as  another 
parameter  variation.  Specifically,  two  model  types  for  clutter 
temporal  correlation  are  considered:  exponential -shaped,  and 
Gaus  s i an - shaped . 

The  results  in  [Roman  et  al . ,  2000]  were  generated  using  the 
algorithmic  approach  of  Option  3.  In  most  analyses  carried  out  by 
SSC,  Options  1  and  2  generate  very  similar  results,  and  Options  3 
and  4  also  generate  similar  results.  Thus,  results  are  presented 
only  for  Options  1  and  4.  Furthermore,  the  cases  considered 
represent  only  the  J«N  Analysis,  and  only  one  value  of  output 
signal-to-interference-plus-noise  ratio  (SINR)  .  The  J  =  N  Analysis 
is  considered  in  [Roman  et  al.,  2000]  . 

The  conditions  and  parameters  for  the  J  «  N  Analysis  are :  J  =  4 
channels,  N  =  32  pulses,  Ovv  ~  ^  normalized  receiver  noise  standard 
deviation,  CNR=40dB  clutter-to-noise  ratio,  and  =  0.0  and  = 
0.3336  target  normalized  spatial  and  Doppler  frequencies, 
respectively.  Three  sets  of  scenario  conditions  and  two  values  of 
secondary  data  size  are  considered.  Case  1  is  for  Y=0deg  crab 
angle  and  no  jamming.  Case  2  is  for  y  =  20  deg  and  no  jamming. 
Case  3  is  for  y=0deg  and  two  point-source  barrage  jammers:  one 
jammer  is  at  fjg  =  -  0.35  jammer  normalized  spatial  frequency  with  JNR 
=  45  dB  jammer- to-noise  ratio,  and  the  other  jammer  is  at  fjg  =  0.2 
with  JNR  =  50  dB .  For  each  case  two  values  of  secondary  data  size 
are  considered:  K  =  2JN  =  256  (the  Brennan  rule-of  thumb  value  for  3 


105 


dB  performance),  and  K  =  2J  =  8.  For  this  Monte  Carlo  (MC)  analysis, 
Pp^  =  0.01  probability  of  false  alarm,  using  =  50  repetitions  of 

Npp^  =  2,000  independent  data  realizations  each.  Also,  P  =  3  model 
order  is  used,  as  in  [Roman  et  al . ,  2000]  . 

Tables  A-1  and  A-2  present  probability  of  detection  (Pq) 

results  for  the  three  simulation  cases  at  SINR  =  9  dB  and  the 

exponential -shaped  model  for  ground  clutter  temporal  correlation. 
The  sample  standard  deviation  of  the  Pp  estimates,  denoted  as 

SD[Pp],  is  presented  also  in  both  tables.  In  addition,  both  tables 

include  detection  results  for  the  optimal  matched  filter  (MF)  and 
the  constant  false  alarm  rate  (CFAR)  adaptive  matched  filter 
(AMF)  .  The  MF  results  are  analytical,  whereas  the  CFAR  AMF 
results  are  simulation-based.  Table  A-1  results  are  for  K  =  2JN  = 

256,  and  compare  with  the  respective  single  points  in  the  SINR  vs. 
Pp  curves  of  Figures  4,  6,  and  8  in  [Roman  et  al . ,  2000] .  Table 

A-2  results  are  for  K  =  2J  =  8,  and  compare  with  the  respective 
single  points  in  the  SINR  vs.  Pp  curves  of  Figures  5,  7,  and  9  in 

[Roman  et  al .  ,  2000]  . 

Examination  of  the  results  in  Tables  A-1  and  A-2  indicates 
that  both  PAMF-LS  implementations  out-perform  the  CFAR  AMF,  in 
terms  of  detection  as  well  as  variability  (smaller  standard 
deviation)  ,  for  both  values  of  K  (the  CFAR  AMF  values  in  both 
tables  are  for  K  =  2JN  =  256)  .  Notice  also  that  both  PAMF-LS 
implementations  perform  almost  identically  for  the  large  K  value, 
and  the  performance  is  close  to  that  of  the  optimal  MF.  However, 
for  the  small  K  value  the  PAMF-LS  using  covariance  averaging 
(Option  4)  out -performs ,  also  in  terms  of  detection  and 
variability,  the  PAMF-LS  using  coefficient  averaging  (Option  1) . 
Simulation-based  results  obtained  by  SSC  for  other  analyses  and 
conditions  reflect  the  same  trend. 
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CASE 

MF 

CFAR  AMF 

PAMF-LS 
COEFF  AVG 
(P  =  3) 

PAMF-LS 
COVAR  AVG 
(P  =  3) 

CRAB 

ANGLE 

(deg) 

NO.  OF 
JAMMERS 

Pd 

Pd 

SD[PJ 

Pd 

SD[PJ 

B 

SD[PJ 

0 

0 

0.8635 

0,4571 

0.0397 

0.8190 

0.0118 

0.0104 

20 

0 

0.8635 

0.4610 

0.0382 

0.8206 

0.0125 

0.8219 

0.0116 

0 

2 

0.8635 

0.4565 

0.0375 

0.8229 

0.0125 

0.8280 

0.0118 

Table  A-1.  Detection  performance  of  Options  1  and  4  of  the  PAMF- 
LS  for  the  three  simulation  cases  with  SINR  =  9clB,  K  =  2JN  =  256,  and 
the  exponential -shaped  model  for  clutter  temporal  correlation. 


CASE 

MF 

CFAR  AMF 
(K  =  2JN) 

PAMF-LS 
COEFF  AVG 
(P  =  3) 

PAMF-LS 
COVAR  AVG 
(P  =  3) 

CRAB 

ANGLE 

(deg) 

NO.  OF 
JAMMERS 

Pd 

Pd 

SD[Ppl 

Pd 

SDPJ 

Pd 

SD[PJ 

0 

0 

0.8635 

0.4571 

0.0397 

0.7542 

0.0270 

0.7835 

0.0690 

20 

0 

0.8635 

0.4610 

0.0382 

0.7573 

0.0893 

0.0575 

0 

2 

0.8635 

0.4565 

0,0375 

0.7609 

0.0812 

0.7822 

0.0661 

Table  A-2.  Detection  performance  of  Options  1  and  4  of  the  PAMF- 
LS  for  the  three  simulation  cases  with  SINR  =  9dB,  K  =  2J  =  8,  and 
the  exponential -shaped  model  for  clutter  temporal  correlation. 


Results  of  a  second  set  of  simulation  runs  with  identical 
parameters  and  scenario  conditions,  except  for  the  ground  clutter 
temporal  correlation  model,  are  presented  in  Tables  A-3  and  A-4. 
For  these  tables  the  Gaussian- shaped  model  was  adopted.  Table  A-3 
results  are  for  K  =  2JN  =  256,  and  Table  A-4  results  are  for  K  =  2J  = 
8.  Thus,  Tables  A-3  and  A-4  relate  to  Tables  A-1  and  A-2, 
respectively. 
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Examination  of  the  results  in  Tables  A-3  and  A-4  by 
themselves  leads  to  identical  conclusions  as  obtained  for  Tables 
A-1  and  A-2,  with  the  distinction  that  the  performance  of  the 
PAMF-LS  is  closer  to  that  of  the  MF.  A  comparison  of  Tables  A-3 
and  A-4  with  Tables  A-1  and  A-2,  respectively,  indicates  that  both 
PAMF-LS  implementations  perform  better  for  the  Gaussian-shaped 
clutter  temporal  correlation  model  for  both  values  of  K.  This  is 
in  contrast  to  results  observed  in  previous  analyses  involving 
other  PAMF  implementations.  Specifically,  both  the  PAMF  in 
conjunction  with  the  Strand-Nuttal  AR  model  identification 
algorithm  [Nuttall,  1976;  Strand,  1977]  and  the  PAMF  in 
conjunction  with  the  canonical  correlations  state  variable  model 
identification  algorithm  exhibit  degraded  performance  for  the 
Gaussian-shaped  clutter  temporal  correlation  model.  Other 
simulation-based  analyses  carried  out  by  SSC  further  support  these 
observations  and  conclusions. 


CASE 

MF 

CFAR  AMF 

PAMF-LS 
COEFF  AVG 
(P  =  3) 

PAMF-LS 
COVAR  AVG 
(P  =  3) 

CRAB 

ANGLE 

(deg) 

NO.  OF 
JAMMERS 

Pd 

B 

SD[PJ 

Pd 

SD[PJ 

B 

SD[PJ 

0 

0 

B 

0.4654 

0.0507 

0.8360 

0.0113 

0.8383 

0.0103 

20 

0 

0.8635 

0.4622 

0.0495 

0.7925 

0.0173 

0.8007 

0.0111 

0 

2 

0.8635 

0.4643 

0.0498 

0.8269 

0.0129 

0.8290 

0.0114 

Table  A-3.  Detection  performance  of  Options  1  and  4  of  the  PAMF- 
LS  for  the  three  simulation  cases  with  SINR  =  9dB,  K  =  2JN  =  256,  and 
the  Gaussian-shaped  model  for  clutter  temporal  correlation. 
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CASE 

MF 

CFAR  AMF 
(K  =  2JN) 

PAMF-LS 
COEFF  AVG 
(P  =  3) 

PAMF-LS 
COVAR  AVG 
(P  =  3) 

CRAB 

ANGLE 

(deg) 

NO.  OF 
JAMMERS 

Pd 

SD[Pp] 

Pd 

SD[PJ 

Pd 

SD[Ppl 

0 

0 

0.8635 

0.4654 

0.0507 

0.7829 

0.0670 

0.8054 

0.0517 

20 

0 

0.8635 

0.4622 

0.0495 

0.7417 

0.0937 

0.7932 

0.0699 

0 

2 

0.8635 

0.4643 

0.0498 

0.7798 

0.0724 

0.8035 

0.0626 

Table  A-4.  Detection  performance  of  Options  1  and  4  of  the  PAMF- 
LS  for  the  three  simulation  cases  with  SINR  =  9dB,  K  =  2J  =  8,  and 
the  Gaussian- shaped  model  for  clutter  temporal  correlation. 


In  closing,  it  is  important  to  mention  that  the  AR  model 
order  value  utilized  herein  is  the  same  value  utilized  in  [Roman 
et  al . ,  2000],  in  order  to  allow  direct  comparison  between  both 
sets  of  results.  Thus,  the  model  order  has  been  fixed  without  the 
benefit  of  optimization  for  the  covariance -averaging  algorithmic 
option  and  the  Gaussian-shaped  model  for  the  ground  clutter 
temporal  correlation.  Early  results  from  an  on-going  simulation- 

based  analysis  indicate  that  P  =  2  is  possibly  a  better  choice.  In 
particular,  for  the  y  =  20  deg  case  in  Table  A-3,  =  0.8236  and 

SD[Pp]  =  0.0101  using  P  =  2,  and  for  the  Y=20deg  case  in  Table  A-4, 

the  detection  and  variability  results  are  very  similar  using  P  =  2. 
Based  on  statistical  considerations,  the  lowest -order  model  that 
gives  acceptable  performance  should  be  adopted. 
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